A Case Study on Context Encoding in Multi-Encoder based Document-Level Neural Machine Translation

Ramakrishna Appicharla; Baban Gain; Santanu Pal; Asif Ekbal

Abstract

Recent studies have shown that multi-encoder models are agnostic to the choice of context, and that the context encoder generates noise that helps improve the models' BLEU scores. In this paper, we explore this idea further by training multi-encoder models under three context settings (the previous two sentences, two random sentences, and a mix of both) and evaluating them on a context-aware pronoun translation test set. Specifically, we evaluate the models on the ContraPro test set to study how different contexts affect pronoun translation accuracy. The results show that the model can perform well on ContraPro even when the context is random. We also analyze the source representations to study whether the context encoder is generating noise. Our analysis shows that the context encoder provides sufficient information for the model to learn discourse-level information. Additionally, we observe that mixing the selected context (the previous two sentences in this case) with the random context is generally better than the other settings.
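
The three context settings named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `build_context`, drawing random sentences from the same document, and the per-example 50/50 choice in the mixed setting are all assumptions made only for illustration.

```python
import random


def build_context(sentences, index, mode, rng, k=2):
    """Return up to k context sentences for sentences[index].

    Hypothetical helper, not taken from the paper.
    """
    if mode == "previous":
        # The k sentences immediately before the current one
        # (fewer, or none, near the start of the document).
        return sentences[max(0, index - k):index]
    if mode == "random":
        # Assumption: sample from the same document, excluding the current sentence.
        candidates = [s for j, s in enumerate(sentences) if j != index]
        return rng.sample(candidates, min(k, len(candidates)))
    if mode == "mixed":
        # Assumption: pick one of the two settings with equal probability per example.
        chosen = rng.choice(["previous", "random"])
        return build_context(sentences, index, chosen, rng, k)
    raise ValueError(f"unknown mode: {mode}")


doc = ["s0", "s1", "s2", "s3", "s4"]
rng = random.Random(0)
print(build_context(doc, 3, "previous", rng))  # ['s1', 's2']
print(build_context(doc, 3, "random", rng))    # two sentences other than 's3'
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which matters when comparing models trained under the different settings.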

Anthology ID: 2023.mtsummit-research.14
Volume: Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track
Month: September
Year: 2023
Address: Macau SAR, China
Editors: Masao Utiyama, Rui Wang
Venue: MTSummit
Publisher: Asia-Pacific Association for Machine Translation
Pages: 160–172
URL: https://aclanthology.org/2023.mtsummit-research.14
Cite (ACL): Ramakrishna Appicharla, Baban Gain, Santanu Pal, and Asif Ekbal. 2023. A Case Study on Context Encoding in Multi-Encoder based Document-Level Neural Machine Translation. In Proceedings of Machine Translation Summit XIX, Vol. 1: Research Track, pages 160–172, Macau SAR, China. Asia-Pacific Association for Machine Translation.
Cite (Informal): A Case Study on Context Encoding in Multi-Encoder based Document-Level Neural Machine Translation (Appicharla et al., MTSummit 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-5/2023.mtsummit-research.14.pdf
