Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement

HyoJung Han, Seokchan Ahn, Yoonjung Choi, Insoo Chung, Sangha Kim, Kyunghyun Cho


Abstract
Recent simultaneous machine translation models are often trained on conventional full-sentence translation corpora, which leads to either excessive latency or the need to anticipate words that have not yet arrived when the two languages differ significantly in word order. This is unlike human simultaneous interpreters, who produce largely monotonic translations at the expense of the grammaticality of the sentence being translated. In this paper, we therefore propose an algorithm that reorders and refines the target side of a full-sentence translation corpus, using word alignment and non-autoregressive neural machine translation, so that words and phrases in the source and target sentences are aligned largely monotonically. We then train a widely used wait-k simultaneous translation model on this reordered-and-refined corpus. The proposed approach improves BLEU scores, and the resulting translations are more monotonic with respect to the source sentences.
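For readers unfamiliar with the wait-k policy mentioned in the abstract, the following is a minimal sketch of its read/write schedule, not the authors' implementation. It assumes the standard formulation in which the decoder first reads k source tokens and then alternates between writing one target token and reading one more source token until the source is exhausted. The function name `wait_k_schedule` and its parameters are hypothetical and chosen only for illustration.

```python
from typing import List


def wait_k_schedule(source_len: int, target_len: int, k: int) -> List[int]:
    """Illustrative wait-k schedule (hypothetical helper, not from the paper).

    For each target position t (1-indexed), return how many source tokens
    have been read at the moment target token t is written.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Target token t is written after k + t - 1 source tokens have been read,
    # capped at the source length once the whole source has arrived.
    return [min(k + t - 1, source_len) for t in range(1, target_len + 1)]


if __name__ == "__main__":
    # With k = 3 and a 6-token source, the first target token waits for
    # 3 source tokens; later tokens each see one more, up to the full source.
    print(wait_k_schedule(source_len=6, target_len=6, k=3))  # [3, 4, 5, 6, 6, 6]
```

The schedule shows why word order matters: if the source word needed for target token t arrives later than position k + t - 1, the model must either guess it or be trained on targets that are reordered to avoid the dependency, which is the motivation the abstract gives for a monotonic corpus.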
Anthology ID:
2021.wmt-1.119
Volume:
Proceedings of the Sixth Conference on Machine Translation
Month:
November
Year:
2021
Address:
Online
Editors:
Loic Barrault, Ondrej Bojar, Fethi Bougares, Rajen Chatterjee, Marta R. Costa-jussa, Christian Federmann, Mark Fishel, Alexander Fraser, Markus Freitag, Yvette Graham, Roman Grundkiewicz, Paco Guzman, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Philipp Koehn, Tom Kocmi, Andre Martins, Makoto Morishita, Christof Monz
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1110–1123
URL:
https://aclanthology.org/2021.wmt-1.119
Cite (ACL):
HyoJung Han, Seokchan Ahn, Yoonjung Choi, Insoo Chung, Sangha Kim, and Kyunghyun Cho. 2021. Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement. In Proceedings of the Sixth Conference on Machine Translation, pages 1110–1123, Online. Association for Computational Linguistics.
Cite (Informal):
Monotonic Simultaneous Translation with Chunk-wise Reordering and Refinement (Han et al., WMT 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/2021.wmt-1.119.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-3/2021.wmt-1.119.mp4
Data
MTNT