Experiments in Mamba Sequence Modeling and NLLB-200 Fine-Tuning for Low Resource Multilingual Machine Translation

Dan Degenaro, Tom Lupicki


Abstract
This paper presents DC_DMV’s submission to the AmericasNLP 2024 Shared Task 1: Machine Translation Systems for Indigenous Languages. Our submission consists of two multilingual approaches to building machine translation systems from Spanish into eleven Indigenous languages: fine-tuning the 600M distilled variant of NLLB-200, and an experiment in training a neural network from scratch using the Mamba state space modeling architecture. By fine-tuning NLLB-200, we achieve the best test-set results for 4 of the language pairs across two checkpoints, and outperform the baseline test-set score for 2 languages.
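As a rough illustration of the first approach, the sketch below fine-tunes the 600M distilled NLLB-200 checkpoint with the Hugging Face transformers Seq2SeqTrainer. The single language pair (Spanish to Ayacucho Quechua), the toy in-memory dataset, and the hyperparameters are placeholder assumptions for illustration, not the authors' released configuration, which trains multilingually across all eleven languages.

# A minimal, hypothetical sketch: fine-tuning the 600M distilled NLLB-200
# checkpoint with Hugging Face transformers. Spanish -> Ayacucho Quechua
# (quy_Latn) stands in for the eleven task languages; dataset and
# hyperparameters are placeholders, not the authors' actual setup.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

MODEL = "facebook/nllb-200-distilled-600M"
# NLLB-200 uses FLORES-200 language codes: spa_Latn = Spanish, quy_Latn = Quechua.
tokenizer = AutoTokenizer.from_pretrained(MODEL, src_lang="spa_Latn", tgt_lang="quy_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

# Placeholder parallel data; the shared task supplies real train/dev splits.
train = Dataset.from_dict({
    "es":  ["<Spanish sentence 1>", "<Spanish sentence 2>"],
    "quy": ["<Quechua sentence 1>", "<Quechua sentence 2>"],
})

def preprocess(batch):
    # text_target tokenizes the target side as decoder labels.
    return tokenizer(batch["es"], text_target=batch["quy"],
                     truncation=True, max_length=128)

train = train.map(preprocess, batched=True, remove_columns=["es", "quy"])

args = Seq2SeqTrainingArguments(
    output_dir="nllb200-600M-es-quy",
    per_device_train_batch_size=8,
    learning_rate=3e-5,
    num_train_epochs=1,
)

Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
).train()

At inference time, NLLB-200 decoding is steered to the target language by forcing its language code as the first generated token, e.g. model.generate(**inputs, forced_bos_token_id=tokenizer.convert_tokens_to_ids("quy_Latn")).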
Anthology ID:
2024.americasnlp-1.22
Volume:
Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Manuel Mager, Abteen Ebrahimi, Shruti Rijhwani, Arturo Oncevay, Luis Chiruzzo, Robert Pugh, Katharina von der Wense
Venues:
AmericasNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
188–194
URL:
https://aclanthology.org/2024.americasnlp-1.22
Cite (ACL):
Dan Degenaro and Tom Lupicki. 2024. Experiments in Mamba Sequence Modeling and NLLB-200 Fine-Tuning for Low Resource Multilingual Machine Translation. In Proceedings of the 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP 2024), pages 188–194, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Experiments in Mamba Sequence Modeling and NLLB-200 Fine-Tuning for Low Resource Multilingual Machine Translation (Degenaro & Lupicki, AmericasNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.americasnlp-1.22.pdf
Supplementary material:
 2024.americasnlp-1.22.SupplementaryMaterial.zip