Investigating Active Learning in Interactive Neural Machine Translation

Kamal Gupta, Dhanvanth Boppana, Rejwanul Haque, Asif Ekbal, Pushpak Bhattacharyya


Abstract
Interactive-predictive translation is a collaborative iterative process, where human translators produce translations with the help of machine translation (MT) systems interactively. Various sampling techniques in active learning (AL) exist to update the neural MT (NMT) model in the interactive-predictive scenario. In this paper, we explore term based (named entity count (NEC)) and quality based (quality estimation (QE), sentence similarity (Sim)) sampling techniques, which are used to find the ideal candidates from the incoming data, for human supervision and MT model’s weight updation. We carried out experiments with three language pairs, viz. German-English, Spanish-English and Hindi-English. Our proposed sampling technique yields 1.82, 0.77 and 0.81 BLEU points improvements for German-English, Spanish-English and Hindi-English, respectively, over the random sampling based baseline. It also improves the present state-of-the-art by 0.35 and 0.12 BLEU points for German-English and Spanish-English, respectively. Human editing effort in terms of number-of-words-changed also improves by 5 and 4 points for German-English and Spanish-English, respectively, compared to the state-of-the-art.
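The abstract describes sampling as scoring incoming sentences and sending the best candidates to a human for supervision. A minimal sketch of that selection step follows. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed `budget`, the choice to prefer sentences with more entities, and the capitalization-based entity counter are all hypothetical. A real term based sampler would use a proper named entity recognizer, and the capitalization proxy would not work for a language such as German, where all nouns are capitalized.

```python
from typing import Callable, List


def crude_entity_count(sentence: str) -> int:
    """Hypothetical stand-in for a named entity counter.

    Counts capitalized tokens other than the first token of the sentence.
    This is only a rough proxy; a real system would call an NER tagger.
    """
    tokens = sentence.split()
    return sum(1 for tok in tokens[1:] if tok[:1].isupper())


def select_for_supervision(
    block: List[str],
    score: Callable[[str], float],
    budget: int,
) -> List[int]:
    """Return the indices of the `budget` highest-scoring sentences.

    Assumption: a higher score means the sentence is a better candidate
    for human supervision. Ties keep their original order.
    """
    ranked = sorted(range(len(block)), key=lambda i: score(block[i]), reverse=True)
    return sorted(ranked[:budget])


# A small block of incoming source sentences (made-up examples).
block = [
    "The committee met yesterday.",
    "Angela Merkel visited Paris and Madrid with Emmanuel Macron.",
    "We thank the rapporteur for the report.",
    "The European Parliament adopted the Lisbon Treaty.",
]

# Pick two sentences for the human translator to supervise.
chosen = select_for_supervision(block, crude_entity_count, budget=2)
```

Because the scoring function is passed in as a parameter, a quality based score (for example, a quality estimation or sentence similarity score) could be swapped in without changing the selection logic.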
Anthology ID:
2021.mtsummit-research.2
Volume:
Proceedings of Machine Translation Summit XVIII: Research Track
Month:
August
Year:
2021
Address:
Virtual
Editors:
Kevin Duh, Francisco Guzmán
Venue:
MTSummit
Publisher:
Association for Machine Translation in the Americas
Pages:
10–22
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/2021.mtsummit-research.2/
Cite (ACL):
Kamal Gupta, Dhanvanth Boppana, Rejwanul Haque, Asif Ekbal, and Pushpak Bhattacharyya. 2021. Investigating Active Learning in Interactive Neural Machine Translation. In Proceedings of Machine Translation Summit XVIII: Research Track, pages 10–22, Virtual. Association for Machine Translation in the Americas.
Cite (Informal):
Investigating Active Learning in Interactive Neural Machine Translation (Gupta et al., MTSummit 2021)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/2021.mtsummit-research.2.pdf
Data
Europarl