Abstract
In a multi-lingual and multi-script society such as India, many users resort to code-mixing while typing on social media. While code-mixing has received a lot of attention in the past few years, it has mostly been studied within a single-script scenario. In this work, we present a case study of Hindi-English bilingual Twitter users while considering the nuances that come with the intermixing of different scripts. We present a concise analysis of how scripts and languages interact in communities and cultures where code-mixing is rampant and offer certain insights into the findings. Our analysis shows that both intra-sentential and inter-sentential script-mixing are present on Twitter and show different behavior in different contexts. Examples suggest that script can be employed as a tool for emphasizing certain phrases within a sentence or disambiguating the meaning of a word. Script choice can also be an indicator of whether a word is borrowed or not. We present our analysis along with examples that bring out the nuances of the different cases.
- Anthology ID:
- 2020.calcs-1.5
- Volume:
- Proceedings of the The 4th Workshop on Computational Approaches to Code Switching
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Venue:
- CALCS
- Publisher:
- European Language Resources Association
- Pages:
- 36–44
- Language:
- English
- URL:
- https://aclanthology.org/2020.calcs-1.5
- Cite (ACL):
- Abhishek Srivastava, Kalika Bali, and Monojit Choudhury. 2020. Understanding Script-Mixing: A Case Study of Hindi-English Bilingual Twitter Users. In Proceedings of the The 4th Workshop on Computational Approaches to Code Switching, pages 36–44, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Understanding Script-Mixing: A Case Study of Hindi-English Bilingual Twitter Users (Srivastava et al., CALCS 2020)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2020.calcs-1.5.pdf