Where and How Do Languages Mix? A Study of Spanish-Guaraní Code-Switching in Paraguay

Olga Kellert, Nemika Tyagi


Abstract
Code-switching, the alternating use of multiple languages within a single utterance, is a widespread linguistic phenomenon that poses unique challenges for both sociolinguistic analysis and Natural Language Processing (NLP). While prior research has explored code-switching from either a syntactic or geographic perspective, few studies have integrated both aspects, particularly for underexplored language pairs like Spanish-Guaraní. In this paper, we analyze Spanish-Guaraní code-switching using a dataset of geotagged tweets from Asunción, Paraguay, collected from 2017 to 2021. We employ a differential distribution method to map the geographic distribution of code-switching across urban zones and analyze its syntactic positioning within sentences. Our findings reveal distinct spatial patterns, with Guaraní-dominant tweets concentrated in the western and southwestern areas, while Spanish-only tweets are more prevalent in central and eastern regions. Syntactic analysis shows that code-switching occurs most frequently in the middle of sentences, often involving verbs, pronouns, and adjectives. These results provide new insights into the interaction between linguistic, social, and geographic factors in bilingual communication. Our study contributes to both sociolinguistic research and NLP applications, offering a framework for analyzing mixed-language data in digital communication.
Anthology ID:
2025.calcs-1.4
Volume:
Proceedings of the 7th Workshop on Computational Approaches to Linguistic Code-Switching
Month:
May
Year:
2025
Address:
Albuquerque, New Mexico, USA
Editors:
Genta Indra Winata, Sudipta Kar, Marina Zhukova, Thamar Solorio, Xi Ai, Injy Hamed, Mahardika Krisna Ihsani, Derry Tanti Wijaya, Garry Kuwanto
Venues:
CALCS | WS
Publisher:
Association for Computational Linguistics
Pages:
26–31
URL:
https://preview.aclanthology.org/landing_page/2025.calcs-1.4/
Cite (ACL):
Olga Kellert and Nemika Tyagi. 2025. Where and How Do Languages Mix? A Study of Spanish-Guaraní Code-Switching in Paraguay. In Proceedings of the 7th Workshop on Computational Approaches to Linguistic Code-Switching, pages 26–31, Albuquerque, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Where and How Do Languages Mix? A Study of Spanish-Guaraní Code-Switching in Paraguay (Kellert & Tyagi, CALCS 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.calcs-1.4.pdf