Generating and Analyzing Disfluency in a Code-Mixed Setting
Aryan Paul, Tapabrata Mondal, Dipankar Das, Sivaji Bandyopadhyay
Abstract
This work explores the intersection of code-mixing and disfluency in bilingual speech and text, with a focus on understanding how large language models (LLMs) handle code-mixed disfluent utterances. One of the primary objectives is to explore LLMs’ ability to generate code-mixed disfluent sentences and to address the lack of high-quality code-mixed disfluent corpora, particularly for Indic languages. We aim to compare the performance of LLM-based approaches with traditional disfluency detection methods and to develop novel metrics for quantitatively assessing disfluency phenomena. Additionally, we investigate the relationship between code-mixing and disfluency, exploring how factors such as switching frequency and direction influence the occurrence of disfluencies. By analyzing these intriguing dynamics, we seek to gain a deeper understanding of the mutual influence between code-mixing and disfluency in multilingual speech.
- Anthology ID: 2025.ranlp-1.105
- Volume: Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
- Month: September
- Year: 2025
- Address: Varna, Bulgaria
- Editors: Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
- Venue: RANLP
- Publisher: INCOMA Ltd., Shoumen, Bulgaria
- Pages: 915–924
- URL: https://preview.aclanthology.org/corrections-2026-01/2025.ranlp-1.105/
- Cite (ACL): Aryan Paul, Tapabrata Mondal, Dipankar Das, and Sivaji Bandyopadhyay. 2025. Generating and Analyzing Disfluency in a Code-Mixed Setting. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 915–924, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
- Cite (Informal): Generating and Analyzing Disfluency in a Code-Mixed Setting (Paul et al., RANLP 2025)
- PDF: https://preview.aclanthology.org/corrections-2026-01/2025.ranlp-1.105.pdf