Abstract
This paper explores the impact of different back-translation approaches on machine translation for Ladin, specifically the Val Badia variant. Given the limited amount of parallel data available for this language (only 18k Ladin-Italian sentence pairs), we investigate the performance of a multilingual neural machine translation model fine-tuned for Ladin-Italian. In addition to the available authentic data, we synthesise further translations using three different models: a fine-tuned neural model, a rule-based system developed specifically for this language pair, and a large language model. Our experiments show that all approaches achieve comparable translation quality in this low-resource scenario, yet round-trip translations highlight differences in model performance.

- Anthology ID:
- 2024.loresmt-1.13
- Volume:
- Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024)
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Atul Kr. Ojha, Chao-hong Liu, Ekaterina Vylomova, Flammie Pirinen, Jade Abbott, Jonathan Washington, Nathaniel Oco, Valentin Malykh, Varvara Logacheva, Xiaobing Zhao
- Venues:
- LoResMT | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 128–138
- URL:
- https://aclanthology.org/2024.loresmt-1.13
- DOI:
- 10.18653/v1/2024.loresmt-1.13
- Cite (ACL):
- Samuel Frontull and Georg Moser. 2024. Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin. In Proceedings of the Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024), pages 128–138, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- Rule-Based, Neural and LLM Back-Translation: Comparative Insights from a Variant of Ladin (Frontull & Moser, LoResMT-WS 2024)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/2024.loresmt-1.13.pdf