Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context

Felermino D. M. A. Ali, Henrique Lopes Cardoso, Rui Sousa-Silva


Abstract
This research investigates how to improve machine translation systems for low-resource languages by integrating loanword constraints as external linguistic knowledge. Focusing on the Portuguese-Emakhuwa language pair, which exhibits significant lexical borrowing, we address the challenge of adapting loanwords effectively during translation. We propose a novel approach that augments source sentences with loanword constraints, explicitly linking source-language loanwords to their target-language equivalents. We then perform supervised fine-tuning of multilingual neural machine translation models and several large language models of different sizes. Our results demonstrate that incorporating loanword constraints significantly improves both overall translation quality and the correct adaptation of loanwords in the target languages, as measured by several machine translation metrics. This approach offers a promising direction for improving machine translation in low-resource settings characterized by frequent lexical borrowing.
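The idea of augmenting a source sentence with loanword constraints can be illustrated with a minimal sketch. The abstract does not specify the constraint format, so everything below is an assumption made for illustration: the `<sep>` separator, the `source -> target` pair notation, the function name `augment_with_constraints`, and the lexicon itself, whose target-side values are placeholders rather than real Emakhuwa forms.

```python
import re

# Hypothetical loanword lexicon: Portuguese source word -> target-language form.
# The target values are placeholders, NOT real Emakhuwa words.
EXAMPLE_LEXICON = {
    "escola": "TARGET_FORM_A",
    "hospital": "TARGET_FORM_B",
}


def augment_with_constraints(source, lexicon, sep=" <sep> "):
    """Append loanword constraints to a source sentence.

    Each source word found in `lexicon` contributes one "word -> target"
    pair. The separator and pair notation are assumed, not taken from
    the paper.
    """
    # Lowercase and split on word characters so punctuation is ignored.
    tokens = re.findall(r"\w+", source.lower())

    seen = set()
    pairs = []
    for token in tokens:
        # Emit each constraint once, in order of first appearance.
        if token in lexicon and token not in seen:
            seen.add(token)
            pairs.append(f"{token} -> {lexicon[token]}")

    # Sentences without known loanwords are returned unchanged.
    if not pairs:
        return source
    return source + sep + " ; ".join(pairs)


if __name__ == "__main__":
    print(augment_with_constraints(
        "A escola fica perto do hospital.", EXAMPLE_LEXICON))
```

In a setup like this, the augmented strings would replace the plain source sentences in the fine-tuning data, so the model can learn to copy the linked target form when a constraint is present. Exact word matching is the simplest possible lookup; inflected forms would need lemmatization or a richer matcher.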
Anthology ID:
2025.emnlp-main.1406
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
27631–27645
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1406/
Cite (ACL):
Felermino D. M. A. Ali, Henrique Lopes Cardoso, and Rui Sousa-Silva. 2025. Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 27631–27645, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context (Ali et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1406.pdf
Checklist:
2025.emnlp-main.1406.checklist.pdf