A Review of Discourse-level Machine Translation

Xiaojun Zhang


Abstract
Machine translation (MT) models usually translate a text at the sentence level, considering each sentence in isolation under the strict assumption that the sentences in a text are independent of one another. In fact, texts have discourse-level properties that go beyond individual sentences. These properties reveal themselves in the frequency and distribution of words, word senses, referential forms, and syntactic structures. Disregarding dependencies across sentences harms translation quality, especially in terms of coherence, cohesion, and consistency. To address these problems, several approaches have previously been investigated for conventional statistical machine translation (SMT). With the rapid growth of neural machine translation (NMT), discourse-level NMT has drawn increasing attention from researchers. In this work, we review major approaches to discourse-related problems in both SMT and NMT models and survey recent trends in the field.
Anthology ID:
2020.iwdp-1.2
Volume:
Proceedings of the Second International Workshop of Discourse Processing
Month:
December
Year:
2020
Address:
Suzhou, China
Venue:
iwdp
Publisher:
Association for Computational Linguistics
Pages:
4–12
URL:
https://aclanthology.org/2020.iwdp-1.2
Cite (ACL):
Xiaojun Zhang. 2020. A Review of Discourse-level Machine Translation. In Proceedings of the Second International Workshop of Discourse Processing, pages 4–12, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
A Review of Discourse-level Machine Translation (Zhang, iwdp 2020)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2020.iwdp-1.2.pdf