TAN-IBE: Neural Machine Translation for the romance languages of the Iberian Peninsula
Antoni Oliver, Mercè Vàzquez, Marta Coll-Florit, Sergi Álvarez, Víctor Suárez, Claudi Aventín-Boya, Cristina Valdés, Mar Font, Alejandro Pardos
Abstract
The main goal of this project is to explore the techniques for training NMT systems applied to Spanish, Portuguese, Catalan, Galician, Asturian, Aragonese and Aranese. These languages belong to the same Romance family, but they are very different in terms of the linguistic resources available. Asturian, Aragonese and Aranese can be considered low-resource languages. These characteristics make this setting an excellent place to explore training techniques for low-resource languages: transfer learning and multilingual systems, among others. The first months of the project have been dedicated to the compilation of monolingual and parallel corpora for Asturian, Aragonese and Aranese.
- Anthology ID:
- 2023.eamt-1.50
- Volume:
- Proceedings of the 24th Annual Conference of the European Association for Machine Translation
- Month:
- June
- Year:
- 2023
- Address:
- Tampere, Finland
- Editors:
- Mary Nurminen, Judith Brenner, Maarit Koponen, Sirkku Latomaa, Mikhail Mikhailov, Frederike Schierl, Tharindu Ranasinghe, Eva Vanmassenhove, Sergi Alvarez Vidal, Nora Aranberri, Mara Nunziatini, Carla Parra Escartín, Mikel Forcada, Maja Popovic, Carolina Scarton, Helena Moniz
- Venue:
- EAMT
- Publisher:
- European Association for Machine Translation
- Pages:
- 495–496
- URL:
- https://aclanthology.org/2023.eamt-1.50
- Cite (ACL):
- Antoni Oliver, Mercè Vàzquez, Marta Coll-Florit, Sergi Álvarez, Víctor Suárez, Claudi Aventín-Boya, Cristina Valdés, Mar Font, and Alejandro Pardos. 2023. TAN-IBE: Neural Machine Translation for the romance languages of the Iberian Peninsula. In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pages 495–496, Tampere, Finland. European Association for Machine Translation.
- Cite (Informal):
- TAN-IBE: Neural Machine Translation for the romance languages of the Iberian Peninsula (Oliver et al., EAMT 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2023.eamt-1.50.pdf