Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task
Marcel Bollmann, Rahul Aralikatte, Héctor Murrieta Bello, Daniel Hershcovich, Miryam de Lhoneux, Anders Søgaard
Abstract
We evaluated a range of neural machine translation techniques developed specifically for low-resource scenarios. Unsuccessfully. In the end, we submitted two runs: (i) a standard phrase-based model, and (ii) a random babbling baseline using character trigrams. We found that it was surprisingly hard to beat (i), in spite of this model being, in theory, a bad fit for polysynthetic languages; and more interestingly, that (ii) was better than several of the submitted systems, highlighting how difficult low-resource machine translation for polysynthetic languages is.
- Anthology ID:
- 2021.americasnlp-1.28
- Volume:
- Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Editors:
- Manuel Mager, Arturo Oncevay, Annette Rios, Ivan Vladimir Meza Ruiz, Alexis Palmer, Graham Neubig, Katharina Kann
- Venue:
- AmericasNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 248–254
- URL:
- https://aclanthology.org/2021.americasnlp-1.28
- DOI:
- 10.18653/v1/2021.americasnlp-1.28
- Cite (ACL):
- Marcel Bollmann, Rahul Aralikatte, Héctor Murrieta Bello, Daniel Hershcovich, Miryam de Lhoneux, and Anders Søgaard. 2021. Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task. In Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas, pages 248–254, Online. Association for Computational Linguistics.
- Cite (Informal):
- Moses and the Character-Based Random Babbling Baseline: CoAStaL at AmericasNLP 2021 Shared Task (Bollmann et al., AmericasNLP 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2021.americasnlp-1.28.pdf
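
The abstract describes run (ii) only as "a random babbling baseline using character trigrams". As a rough illustration of what such a baseline could look like, the sketch below counts character trigrams in target-language training lines and then samples output one character at a time, never looking at the source sentence. The padding symbols, the function names `train_trigrams` and `babble`, the sampling scheme, and the length cap are all assumptions made for this sketch; they are not taken from the paper.

```python
import random
from collections import Counter, defaultdict

BOS = "\x02"  # hypothetical start-of-line padding symbol (assumption)
EOS = "\x03"  # hypothetical end-of-line symbol (assumption)


def train_trigrams(lines):
    """Count, for every two-character context, which character follows it."""
    counts = defaultdict(Counter)
    for line in lines:
        padded = BOS * 2 + line + EOS
        for i in range(len(padded) - 2):
            counts[padded[i:i + 2]][padded[i + 2]] += 1
    return counts


def babble(counts, rng, max_len=200):
    """Sample one output line character by character.

    The source sentence is never consulted, which is what makes this
    "babbling" rather than translation.
    """
    context, out = BOS * 2, []
    while len(out) < max_len:
        followers = counts[context]
        if not followers:  # only happens when the model saw no data
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        nxt = rng.choices(chars, weights=weights, k=1)[0]
        if nxt == EOS:
            break
        out.append(nxt)
        context = context[1] + nxt  # slide the two-character window
    return "".join(out)
```

Passing an explicit `random.Random` instance keeps the sampling reproducible. A score for such a baseline reflects only how well the character statistics of the target language are matched, which is presumably why the abstract treats it as a lower bound for real systems.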