Code-Switching and Back-Transliteration Using a Bilingual Model

Daniel Weisberg Mitelman; Nachum Dershowitz; Kfir Bar

Abstract

We address the challenges of automated transliteration and code-switching detection in Judeo-Arabic texts. We introduce two novel machine-learning models, one focused on transliterating Judeo-Arabic into Arabic, and another aimed at identifying non-Arabic words, predominantly Hebrew and Aramaic. Unlike prior work, our models are based on a bilingual Arabic-Hebrew language model, which gives them an advantage in capturing linguistic features shared by the two languages. Evaluation results show that our models outperform prior solutions for the same tasks. As a practical contribution, we present a comprehensive pipeline that takes Judeo-Arabic text, identifies non-Arabic words, and then transliterates the Arabic portions into Arabic script. This work not only advances the state of the art but also offers a valuable toolset for making Judeo-Arabic texts more accessible to a broader Arabic-speaking audience.

Anthology ID: 2024.findings-eacl.102
Volume: Findings of the Association for Computational Linguistics: EACL 2024
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1501–1511
URL: https://aclanthology.org/2024.findings-eacl.102
Cite (ACL): Daniel Weisberg Mitelman, Nachum Dershowitz, and Kfir Bar. 2024. Code-Switching and Back-Transliteration Using a Bilingual Model. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1501–1511, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Code-Switching and Back-Transliteration Using a Bilingual Model (Weisberg Mitelman et al., Findings 2024)
PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2024.findings-eacl.102.pdf