Abstract
Why do bilinguals switch languages within a sentence? The present observational study asks whether word surprisal and word entropy predict code-switching in bilingual written conversation. We describe and model a new dataset of Chinese-English text with 1476 clean code-switched sentences, translated back into Chinese. The model includes known control variables together with word surprisal and word entropy. We found that word surprisal, but not entropy, is a significant predictor that explains code-switching above and beyond other well-known predictors. We also found sentence length to be a significant predictor, which has been related to sentence complexity. We propose high cognitive effort as a reason for code-switching, as it leaves fewer resources for inhibition of the alternative language. We also corroborate previous findings, but this time using a computational model of surprisal, a new language pair, and doing so for written language.
- Anthology ID:
- 2020.emnlp-main.330
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4029–4039
- URL:
- https://aclanthology.org/2020.emnlp-main.330
- DOI:
- 10.18653/v1/2020.emnlp-main.330
- Cite (ACL):
- Jesús Calvillo, Le Fang, Jeremy Cole, and David Reitter. 2020. Surprisal Predicts Code-Switching in Chinese-English Bilingual Text. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4029–4039, Online. Association for Computational Linguistics.
- Cite (Informal):
- Surprisal Predicts Code-Switching in Chinese-English Bilingual Text (Calvillo et al., EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/2020.emnlp-main.330.pdf
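
The abstract names two information-theoretic predictors, word surprisal and word entropy. As a rough illustration of how the two quantities differ, the sketch below computes both from a single next-word probability distribution. The distribution, the words in it, and the function names are hypothetical and chosen only for illustration; this is not the language model or the regression setup used in the paper.

```python
import math


def surprisal(dist, word):
    """Surprisal of one word in bits: -log2 P(word | context)."""
    return -math.log2(dist[word])


def entropy(dist):
    """Entropy of the whole next-word distribution in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical next-word distribution after some context (toy values only).
next_word = {"会议": 0.5, "电影": 0.25, "比赛": 0.125, "meeting": 0.125}

print(surprisal(next_word, "会议"))     # likely word, low surprisal
print(surprisal(next_word, "meeting"))  # unlikely word, high surprisal
print(entropy(next_word))               # uncertainty before the word is seen
```

Surprisal is a property of the word that actually occurs, while entropy describes the uncertainty over all possible continuations before the word is seen, which is why the two can behave differently as predictors.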