LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages

Jared Coleman, Bhaskar Krishnamachari, Ruben Rosales, Khalil Iskarous


Abstract
We propose a new paradigm for machine translation that is particularly useful for no-resource languages (those without any publicly available bilingual or monolingual corpora): LLM-RBMT (LLM-Assisted Rule Based Machine Translation). Using the LLM-RBMT paradigm, we design the first language education/revitalization-oriented machine translator for Owens Valley Paiute (OVP), a critically endangered Indigenous American language for which there is virtually no publicly available data. We present a detailed evaluation of the translator’s components: a rule-based sentence builder, an OVP to English translator, and an English to OVP translator. We also discuss the potential of the paradigm, its limitations, and the many avenues for future research that it opens up.
Anthology ID:
2024.americasnlp-1.9
Volume:
Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Manuel Mager, Abteen Ebrahimi, Shruti Rijhwani, Arturo Oncevay, Luis Chiruzzo, Robert Pugh, Katharina von der Wense
Venues:
AmericasNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
67–87
URL:
https://aclanthology.org/2024.americasnlp-1.9
Cite (ACL):
Jared Coleman, Bhaskar Krishnamachari, Ruben Rosales, and Khalil Iskarous. 2024. LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages. In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), pages 67–87, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages (Coleman et al., AmericasNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.americasnlp-1.9.pdf
Supplementary material:
2024.americasnlp-1.9.SupplementaryMaterial.zip