Challenges in Context-Aware Neural Machine Translation

Linghao Jin, Jacqueline He, Jonathan May, Xuezhe Ma


Abstract
Context-aware neural machine translation, a paradigm that leverages information beyond the sentence-level context to resolve inter-sentential discourse dependencies and improve document-level translation quality, has given rise to a number of recent techniques. However, despite well-reasoned intuitions, most context-aware translation models show only modest improvements over sentence-level systems. In this work, we investigate and present several core challenges that impede progress within the field, relating to discourse phenomena, context usage, model architectures, and document-level evaluation. To address these problems, we propose a more realistic setting for document-level translation, called paragraph-to-paragraph (PARA2PARA) translation, and collect a new dataset of Chinese-English novels to promote future research.
Anthology ID:
2023.emnlp-main.943
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
15246–15263
URL:
https://aclanthology.org/2023.emnlp-main.943
DOI:
10.18653/v1/2023.emnlp-main.943
Cite (ACL):
Linghao Jin, Jacqueline He, Jonathan May, and Xuezhe Ma. 2023. Challenges in Context-Aware Neural Machine Translation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 15246–15263, Singapore. Association for Computational Linguistics.
Cite (Informal):
Challenges in Context-Aware Neural Machine Translation (Jin et al., EMNLP 2023)
PDF:
https://preview.aclanthology.org/landing_page/2023.emnlp-main.943.pdf
Video:
https://preview.aclanthology.org/landing_page/2023.emnlp-main.943.mp4