Universitat d’Alacant’s Submission to the WMT 2024 Shared Task on Translation into Low-Resource Languages of Spain
Aaron Galiano Jimenez, Víctor M. Sánchez-Cartagena, Juan Antonio Perez-Ortiz, Felipe Sánchez-Martínez
Abstract
This paper describes the submissions of the Transducens group of the Universitat d’Alacant to the WMT 2024 Shared Task on Translation into Low-Resource Languages of Spain, which focuses on translation from Spanish into Aragonese, Aranese and Asturian. Our submissions use parallel and monolingual data to fine-tune the NLLB-1.3B model and to investigate the effectiveness of synthetic corpora and of transfer learning between related languages such as Catalan, Galician and Valencian. We also present a many-to-many multilingual neural machine translation model focused on the Romance languages of Spain.
- Anthology ID:
- 2024.wmt-1.85
- Volume:
- Proceedings of the Ninth Conference on Machine Translation
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Barry Haddow, Tom Kocmi, Philipp Koehn, Christof Monz
- Venue:
- WMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 885–891
- URL:
- https://preview.aclanthology.org/add_missing_videos/2024.wmt-1.85/
- DOI:
- 10.18653/v1/2024.wmt-1.85
- Cite (ACL):
- Aaron Galiano Jimenez, Víctor M. Sánchez-Cartagena, Juan Antonio Perez-Ortiz, and Felipe Sánchez-Martínez. 2024. Universitat d’Alacant’s Submission to the WMT 2024 Shared Task on Translation into Low-Resource Languages of Spain. In Proceedings of the Ninth Conference on Machine Translation, pages 885–891, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Universitat d’Alacant’s Submission to the WMT 2024 Shared Task on Translation into Low-Resource Languages of Spain (Galiano Jimenez et al., WMT 2024)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2024.wmt-1.85.pdf