How effective is machine translation on low-resource code-switching? A case study comparing human and automatic metrics

Li Nguyen, Christopher Bryant, Oliver Mayeux, Zheng Yuan


Abstract
This paper presents an investigation into the differences between processing monolingual input and code-switching (CSW) input in the context of machine translation (MT). Specifically, we compare the performance of three MT systems (Google, mBART-50 and M2M-100-big) in terms of their ability to translate monolingual Vietnamese, a low-resource language, and Vietnamese-English CSW. To our knowledge, this is the first study to systematically analyse what might happen when multilingual MT systems are exposed to CSW data using both automatic and human metrics. We find that state-of-the-art neural translation systems not only achieve higher scores on automatic metrics when processing CSW input (compared to monolingual input), but also produce translations that are consistently rated as more semantically faithful by humans. We further suggest that automatic evaluation alone is insufficient for evaluating the translation of CSW input. Our findings establish a new benchmark that offers insights into the relationship between MT and CSW.
Anthology ID:
2023.findings-acl.893
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14186–14195
URL:
https://aclanthology.org/2023.findings-acl.893
DOI:
10.18653/v1/2023.findings-acl.893
Cite (ACL):
Li Nguyen, Christopher Bryant, Oliver Mayeux, and Zheng Yuan. 2023. How effective is machine translation on low-resource code-switching? A case study comparing human and automatic metrics. In Findings of the Association for Computational Linguistics: ACL 2023, pages 14186–14195, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
How effective is machine translation on low-resource code-switching? A case study comparing human and automatic metrics (Nguyen et al., Findings 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.findings-acl.893.pdf