Studying The Impact Of Document-level Context On Simultaneous Neural Machine Translation

Raj Dabre, Aizhan Imankulova, Masahiro Kaneko


Abstract
In a real-time simultaneous translation setting, neural machine translation (NMT) models start generating target language tokens from incomplete source language sentences, which makes them harder to translate and leads to poor translation quality. Previous research has shown that document-level NMT, comprising sentence and context encoders and a decoder, leverages context from neighboring sentences and helps improve translation quality. In simultaneous translation settings, the context from previous sentences should be even more critical. To this end, in this paper, we propose wait-k simultaneous document-level NMT, where we keep the context encoder as it is and replace the source sentence encoder and target language decoder with their wait-k equivalents. We experiment with low- and high-resource settings using the ALT and OpenSubtitles2018 corpora, where we observe minor improvements in translation quality. We then analyze the translations obtained using our models by focusing on sentences that should benefit from the context, and find that the model does, in fact, benefit from context but is unable to effectively leverage it, especially in a low-resource setting. This shows that there is a need for further innovation in the way useful context is identified and leveraged.
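
As a rough illustration of the wait-k idea mentioned in the abstract, the sketch below computes how many source tokens are visible at each target step under the standard wait-k schedule (the decoder first waits for k source tokens, then reads one more per generated token). The function names and the boolean-mask representation are assumptions made for illustration and are not taken from the paper. Previous-sentence context is already complete at translation time, so it would need no such restriction, which is consistent with keeping the context encoder unchanged.

```python
from typing import List


def wait_k_visible_source(k: int, source_len: int, target_len: int) -> List[int]:
    """Number of source tokens visible when generating target token t (1-indexed).

    Standard wait-k schedule: min(k + t - 1, source_len).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t - 1, source_len) for t in range(1, target_len + 1)]


def wait_k_mask(k: int, source_len: int, target_len: int) -> List[List[bool]]:
    """Boolean mask of shape (target_len, source_len).

    mask[t][j] is True when source position j may be attended to
    while generating target position t (both 0-indexed here).
    """
    visible = wait_k_visible_source(k, source_len, target_len)
    return [[j < g for j in range(source_len)] for g in visible]


if __name__ == "__main__":
    # Hypothetical example: wait-3 over a 5-token source and 6-token target.
    print(wait_k_visible_source(3, 5, 6))
    for row in wait_k_mask(3, 5, 6):
        print("".join("x" if allowed else "." for allowed in row))
```

With k = 3, the first target token is generated from only the first three source tokens, and the full source sentence becomes visible from the third target token onward.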
Anthology ID:
2021.mtsummit-research.17
Volume:
Proceedings of Machine Translation Summit XVIII: Research Track
Month:
August
Year:
2021
Address:
Virtual
Editors:
Kevin Duh, Francisco Guzmán
Venue:
MTSummit
Publisher:
Association for Machine Translation in the Americas
Pages:
202–214
URL:
https://aclanthology.org/2021.mtsummit-research.17
Cite (ACL):
Raj Dabre, Aizhan Imankulova, and Masahiro Kaneko. 2021. Studying The Impact Of Document-level Context On Simultaneous Neural Machine Translation. In Proceedings of Machine Translation Summit XVIII: Research Track, pages 202–214, Virtual. Association for Machine Translation in the Americas.
Cite (Informal):
Studying The Impact Of Document-level Context On Simultaneous Neural Machine Translation (Dabre et al., MTSummit 2021)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2021.mtsummit-research.17.pdf