Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses

Prathyusha Jwalapuram; Shafiq Joty; Youlin Shen

doi:10.18653/v1/2020.emnlp-main.177

Abstract

Popular Neural Machine Translation model training uses strategies like backtranslation to improve BLEU scores, requiring large amounts of additional data and training. We introduce a class of conditional generative-discriminative hybrid losses that we use to fine-tune a trained machine translation model. Through a combination of targeted fine-tuning objectives and intuitive re-use of the training data the model has failed to adequately learn from, we improve the performance of both a sentence-level and a contextual model without using any additional data. We target the improvement of pronoun translations through our fine-tuning and evaluate our models on a pronoun benchmark testset. Our sentence-level model shows a 0.5 BLEU improvement on both the WMT14 and the IWSLT13 De-En testsets, while our contextual model achieves the best results, improving from 31.81 to 32 BLEU on the WMT14 De-En testset, and from 32.10 to 33.13 BLEU on the IWSLT13 De-En testset, with corresponding improvements in pronoun translation. We further show the generalizability of our method by reproducing the improvements on two additional language pairs, Fr-En and Cs-En.

Anthology ID: 2020.emnlp-main.177
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 2267–2279
URL: https://preview.aclanthology.org/fix-sig-urls/2020.emnlp-main.177/
DOI: 10.18653/v1/2020.emnlp-main.177
Cite (ACL): Prathyusha Jwalapuram, Shafiq Joty, and Youlin Shen. 2020. Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 2267–2279, Online. Association for Computational Linguistics.
Cite (Informal): Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses (Jwalapuram et al., EMNLP 2020)
PDF: https://preview.aclanthology.org/fix-sig-urls/2020.emnlp-main.177.pdf
Video: https://slideslive.com/38939290
Code: ntunlp/pronoun-finetuning
