Training and Inference Methods for High-Coverage Neural Machine Translation

Michael Yang, Yixin Liu, Rahul Mayuranath


Abstract
This paper describes our system for the Duolingo Simultaneous Translation And Paraphrase for Language Education (STAPLE) shared task at the 4th Workshop on Neural Generation and Translation (WNGT 2020). We participated in the English-to-Japanese track with a Transformer model pretrained on the JParaCrawl corpus and fine-tuned in two steps, first on the JESC corpus and then on the smaller Duolingo training corpus. For training, we find that deliberately exposing the model to higher-quality translations more often is essential for optimal translation performance. For inference, encouraging a small amount of diversity with Diverse Beam Search to improve translation coverage yields only a marginal improvement over regular Beam Search. Finally, using an auxiliary filtering model to remove unlikely candidates from the Beam Search output improves performance further. We achieve a weighted F1 score of 27.56% on our own test set, outperforming the STAPLE AWS translations baseline score of 4.31%.
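The abstract does not say how the model is exposed to higher-quality translations more often. One simple way to picture the idea is weighted sampling: assuming each accepted translation of a prompt carries a quality weight, training targets are drawn in proportion to that weight. The sketch below is illustrative only; the function name `sample_training_pairs`, the example sentences, and the weights are hypothetical and not taken from the paper.

```python
import random
from collections import Counter


def sample_training_pairs(prompt, translations, weights, k, seed=0):
    """Draw k (source, target) pairs, choosing each target in proportion to its weight.

    Assumption: `weights` are non-negative quality scores, one per translation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    targets = rng.choices(translations, weights=weights, k=k)
    return [(prompt, target) for target in targets]


# Hypothetical prompt with three accepted translations and made-up weights.
prompt = "I am a student."
translations = ["私は学生です。", "僕は学生です。", "学生です。"]
weights = [0.7, 0.2, 0.1]

pairs = sample_training_pairs(prompt, translations, weights, k=1000)
counts = Counter(target for _, target in pairs)

# Higher-weight translations appear more often among the sampled training pairs.
for target, n in counts.most_common():
    print(target, n)
```

Sampling with replacement keeps every accepted translation reachable while letting the highest-weighted ones dominate each epoch, which is one plausible reading of "more often" here.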
Anthology ID:
2020.ngt-1.13
Volume:
Proceedings of the Fourth Workshop on Neural Generation and Translation
Month:
July
Year:
2020
Address:
Online
Venue:
NGT
Publisher:
Association for Computational Linguistics
Pages:
119–128
URL:
https://aclanthology.org/2020.ngt-1.13
DOI:
10.18653/v1/2020.ngt-1.13
Cite (ACL):
Michael Yang, Yixin Liu, and Rahul Mayuranath. 2020. Training and Inference Methods for High-Coverage Neural Machine Translation. In Proceedings of the Fourth Workshop on Neural Generation and Translation, pages 119–128, Online. Association for Computational Linguistics.
Cite (Informal):
Training and Inference Methods for High-Coverage Neural Machine Translation (Yang et al., NGT 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.ngt-1.13.pdf
Video:
http://slideslive.com/38929827