Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context
Felermino D. M. A. Ali, Henrique Lopes Cardoso, Rui Sousa-Silva
Abstract
This research investigates how to improve machine translation systems for low-resource languages by integrating loanword constraints as external linguistic knowledge. Focusing on the Portuguese-Emakhuwa language pair, which exhibits significant lexical borrowing, we address the challenge of effectively adapting loanwords during the translation process. To tackle this, we propose a novel approach that augments source sentences with loanword constraints, explicitly linking source-language loanwords to their target-language equivalents. Then, we perform supervised fine-tuning on multilingual neural machine translation models and multiple Large Language Models of different sizes. Our results demonstrate that incorporating loanword constraints leads to significant improvements in translation quality as well as in handling loanword adaptation correctly in target languages, as measured by different machine translation metrics. This approach offers a promising direction for improving machine translation performance in low-resource settings characterized by frequent lexical borrowing.
- Anthology ID:
- 2025.emnlp-main.1406
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 27631–27645
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1406/
- Cite (ACL):
- Felermino D. M. A. Ali, Henrique Lopes Cardoso, and Rui Sousa-Silva. 2025. Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 27631–27645, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Leveraging Loanword Constraints for Improving Machine Translation in a Low-Resource Multilingual Context (Ali et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1406.pdf
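The abstract describes augmenting source sentences with loanword constraints that link source-language loanwords to their target-language equivalents. The abstract does not specify the constraint format, so the following is only a minimal sketch under stated assumptions: the function name `augment_with_constraints`, the `<sep>` separator, the `->` link symbol, and the lexicon entries are all hypothetical, and the target-side strings are placeholders rather than real Emakhuwa forms.

```python
import re


def augment_with_constraints(source, lexicon, sep=" <sep> ", link="->"):
    """Append 'loanword -> target form' pairs for lexicon words found in source.

    The separator and link symbols are illustrative assumptions, not the
    format used in the paper.
    """
    # Lowercase and split on word characters so punctuation does not block matches.
    tokens = re.findall(r"\w+", source.lower())

    # Keep each matched loanword once, in order of first appearance.
    matched = []
    for tok in tokens:
        if tok in lexicon and tok not in matched:
            matched.append(tok)

    # Sentences with no known loanwords are returned unchanged.
    if not matched:
        return source

    constraints = " ; ".join(f"{w} {link} {lexicon[w]}" for w in matched)
    return f"{source}{sep}{constraints}"


# Hypothetical lexicon: Portuguese source words mapped to placeholder targets.
lexicon = {"escola": "TGT_ESCOLA", "mesa": "TGT_MESA"}

augmented = augment_with_constraints("A escola tem uma mesa nova.", lexicon)
print(augmented)
```

In a setup like this, the augmented string would replace the plain source sentence in each fine-tuning pair, so the model sees the intended target form of every known loanword alongside the input.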