Linguistics Department

Colloquium - Evangelia Adamou, French National Center for Scientific Research (CNRS)

Code-switching in Indigenous Languages

Mon, April 16, 2018 | CLA 1.302B

3:00 PM - 5:00 PM


Danny Law

Language mixing and genetic similarity. The case of Tojol-ab'al



Patience Epps

Contrasting linguistic ecologies: Indigenous and colonially-mediated language contact in northwest Amazonia






Barbara Bullock, Gualberto Guzmán, and Almeida Jacqueline Toribio

Metrics for language mixing



Student session: working with small data sets from CILLA and Pangloss



Evangelia Adamou, Director of Research, National Center for Scientific Research (CNRS), Paris
Endangered languages on a scale of language mixing: The Ixcatec documentation corpus (Oaxaca, Mexico)

A widely accepted prediction in contact linguistics is that the more intensive language contact is, the more speakers will borrow and/or codeswitch, and the more they will exhibit structural convergence with the dominant language (e.g., Thomason & Kaufman 1988). This prediction is even stronger for contact between a minority and a majority language in situations of language shift (Winford 2013). However, such correlations have been based on qualitative or questionnaire-based research rather than on comparative quantitative analysis of naturally occurring speech. With progress in Natural Language Processing, some researchers now call for the analysis of more comparable bilingual corpus data to better understand how frequent contact phenomena really are in natural human communication (e.g., Guzmán et al. 2016). Adamou (2016), in particular, suggests that a promising contribution to this discussion may come from the growing field of language documentation.

In this talk, I will compare the corpus of the critically endangered Otomanguean language Ixcatec with 15 bi- or multilingual spoken corpora from a variety of contact settings. The goal is to identify the similarities and differences in language endangerment by taking into consideration both indigenous and non-indigenous settings, echoing some of the research directions in the target article by Mufwene (2017). Analysis of the results shows that most corpora contain less than 5% L2 word tokens and, more rarely, up to 20%-35% L2 word tokens. Interestingly, it appears that the last fluent speakers of endangered languages do not always codeswitch and/or borrow extensively from the majority language, nor do they systematically exhibit convergence; rather, they comply with patterns set in the community prior to the shift. To better understand these results, I develop a scale of language mixing and correlate it with extra-linguistic factors, including the degree of language endangerment. Finally, I discuss these results in light of recent findings in the literature on bilingualism and L1 attrition (Montrul 2008; Bylund 2009; Schmid, Köpke & de Bot 2012; Schmid 2014).
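
As a rough illustration of the kind of measure behind figures such as "5% L2 word tokens", the share of L2 tokens can be computed from a corpus in which every word token carries a language tag. The sketch below is only an assumption about how such a count might be set up; the function name, the tag labels, and the toy corpus are hypothetical and are not taken from the talk or from any of the corpora mentioned.

```python
from collections import Counter


def l2_token_share(tagged_tokens, l1_tag):
    """Return the fraction of word tokens whose language tag is not l1_tag.

    tagged_tokens: iterable of (token, language_tag) pairs.
    l1_tag: the tag that marks the corpus's main (L1) language.
    """
    # Count tokens per language tag.
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus contains no tokens")
    # Every token not tagged as L1 is treated as an L2 token.
    return (total - counts[l1_tag]) / total


# Hypothetical toy corpus: 19 placeholder tokens tagged "L1", 1 tagged "L2".
toy_corpus = [(f"w{i}", "L1") for i in range(19)] + [("w19", "L2")]

share = l2_token_share(toy_corpus, "L1")
print(f"L2 word tokens: {share:.1%}")
```

A simple token ratio like this treats borrowings and codeswitches alike; separating the two would require additional annotation that this sketch does not assume.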



