Alignment verification to improve NMT translation towards highly inflectional languages with limited resources

George Tambouratzis


Abstract
The present article discusses how to improve translation quality when using limited training data to translate towards morphologically rich languages. The starting point is a neural MT system, used to train translation models using solely publicly available parallel data. An initial analysis of the translation output has shown that quality is sub-optimal, due mainly to an insufficient amount of training data. To improve translation quality, a hybridized solution is proposed, using an ensemble of relatively simple NMT systems trained with different metrics, combined with an open source module designed for a low-resource MT system. In experiments with multiple independent test sets, the proposed hybridized method achieves improvements over both (i) the best individual NMT system and (ii) the standard ensemble system provided in Marian-NMT. Improvements over Marian-NMT are in many cases statistically significant. Finally, a qualitative analysis of translation results indicates a greater robustness for the hybridized method.
Anthology ID:
2021.eacl-main.158
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Editors:
Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
1841–1851
URL:
https://aclanthology.org/2021.eacl-main.158
DOI:
10.18653/v1/2021.eacl-main.158
Cite (ACL):
George Tambouratzis. 2021. Alignment verification to improve NMT translation towards highly inflectional languages with limited resources. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1841–1851, Online. Association for Computational Linguistics.
Cite (Informal):
Alignment verification to improve NMT translation towards highly inflectional languages with limited resources (Tambouratzis, EACL 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2021.eacl-main.158.pdf