White Paper, Handbook & Resources for Creating Sociolinguistic Corpora
As part of the National Endowment for the Humanities (NEH) project "Bilingual Voices in the U.S.-Mexico Borderlands," Dr. Bessett, Dr. Carvalho, and I wrote a white paper on our findings about technologically-aided transcription methods for bilingual sociolinguistic corpora.
This white paper, entitled "Technologically-Aided Transcription Methods for Sociolinguistic Corpora: Findings, Resources, and Considerations," details the project from beginning to end, including links to resources and products created during the grant period, along with considerations for developing sociolinguistic corpora.
Project Overview
The project "Bilingual Voices in the U.S.-Mexico Borderlands: Technology-Enhanced Transcription and Community-Engaged Scholarship" piloted technologically-aided transcription methods for two sociolinguistic corpora, supported by the NEH HCRR Planning Grant PW-269430-20. The team aimed to explore and test several transcription methods in two corpus development internship courses at the University of Texas Rio Grande Valley (UTRGV) and the University of Arizona (UA) during Spring 2021. Because manual transcription is extremely time-consuming, the team hoped to find a sustainable option for technologically-aided transcription within the context of community-engaged scholarship courses for two corpora: the Corpus Bilingüe del Valle (CoBiVa), which documents the language of the Rio Grande Valley, and the Corpus del Español en el Sur de Arizona (CESA), which documents Spanish in Southern Arizona. Beyond this goal, the team also developed plans for the long-term preservation of these collections. (For more details on the preservation plans, please see the Final Performance Report.)
Among the resources and grant products included in the white paper are a CESA & CoBiVa Handbook and a resources folder, both created by the team for scholars interested in building community-based sociolinguistic corpora.