H. Rukshan Dias


2025

A Systematic Review on Machine Translation and Transliteration Techniques for Code-Mixed Indo-Aryan Languages
H. Rukshan Dias | Deshan Sumanathilaka
Proceedings of the Twelfth Workshop on Asian Translation (WAT 2025)

In multilingual societies, speakers often blend multiple languages in communication, a phenomenon known as code-mixing. Globalization and the growing influence of social media have further amplified multilingualism, leading to wider use of code-mixing. This systematic review analyzes existing translation and transliteration techniques for code-mixed Indo-Aryan languages, ranging from rule-based and statistical approaches to neural machine translation and transformer-based architectures. It also examines publicly available code-mixed datasets designed for machine translation and transliteration tasks, along with the evaluation metrics commonly introduced and applied in prior studies. Finally, the paper discusses current challenges and limitations, highlighting future research directions for developing more tailored translation pipelines for code-mixed Indo-Aryan languages.