Links to other sociolinguistic corpora
The following sociolinguistic corpora are accessible online:
Spanish
- CANOLFAN: The Miami corpus consists of conversations by Spanish-English bilinguals
- CESA: Corpus del Español en el sur de Arizona (Corpus of Spanish in Southern Arizona)
- Corpus del Español
- Corpus of Mexican Spanish in Salinas
- Corpus Sociolingüístico de la Ciudad de México (CSCM)
- COSER: Corpus Oral y Sonoro del Español Rural
- Lope Blanch Corpus
- PRESEEA: Proyecto para el estudio sociolingüístico del español de España y de América
- Spanish in Texas
Other Languages
- American English - SLAAP: The Sociolinguistic Archive and Analysis Project
- Brazilian Portuguese - ALIP: Amostra Lingüística do Interior Paulista
- Catalan - Ancora
- CORAAL: The Corpus of Regional African American Language
- DECTE: Diachronic Electronic Corpus of Tyneside English
- Dutch - Brieven als Buit
- English - ACE: Australian National Corpus
- French - CFPP: Le Corpus de Français Parlé Parisien
- German - The Kiel Corpus
- Hebrew - CoSIH: The Corpus of Spoken Israeli Hebrew
- ICE: International Corpus of English
- Italian - LABLITA: Corpus of Spontaneous Spoken Italian
- KiDKo: KiezDeutsch-Korpus (German in a multilingual neighborhood of Berlin)
- Lithuanian - Vilniečių interviu bazė "Kalba Vilnius" (database of sociolinguistic interviews "Vilnius Speaking")
- PAC: The Phonology of Contemporary English
- PFC: Phonologie du Français Contemporain (Phonology of Contemporary French)
- Portuguese - Amostra da Fala Paulistana
- Several Portuguese Corpora
- SBCSAE: The Santa Barbara Corpus of Spoken American English
- Swedish - The Swedish Spoken Language Corpus at Göteborg University
- Valibel: Discours et variation (French spoken in Brussels)
- WSC: The Wellington Corpus of Spoken New Zealand English
Sign Language
Multiple Languages
- CHILDES: Child Language Data Exchange System (Spanish-English bilingual children)
- Endangered Languages Archive
- HLVC: Heritage Language Variation and Change Project (Various heritage languages spoken in Toronto)
- LDC: The Linguistic Data Consortium
- The New England Corpus of Heritage and Second Language Speakers (Oral and written production of heritage and L2 speakers of Spanish and Portuguese in New England)
- Wikicorpus (Catalan, Spanish and English)
*If you are aware of other accessible online sociolinguistic corpora, please contact us.