Danae Snchez Villegas


2023

pdf
Sheffield’s Submission to the AmericasNLP Shared Task on Machine Translation into Indigenous Languages
Edward Gow-smith | Danae Snchez Villegas
Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)

The University of Sheffield took part in the shared task 2023 AmericasNLP for all eleven language pairs. Our models consist of training different variations of NLLB-200 model on data provided by the organizers and available data from various sources such as constitutions, handbooks and news articles. Our models outperform the baseline model on the development set on chrF with substantial improvements particularly for Aymara, Guarani and Quechua. On the test set, our best submission achieves the highest average chrF of all the submissions, we rank first in four of the eleven languages, and at least one of our models ranks in the top 3 for all languages.