Abstract
In this work, we explore multiple neural architectures adapted for the task of automatic post-editing of machine translation output. We focus on neural end-to-end models that combine both inputs mt (raw MT output) and src (source-language input) in a single neural architecture, modeling {mt, src} → pe directly. In addition, we investigate hard-attention models, which seem well suited to monolingual tasks, as well as combinations of both ideas. We report results on the data sets provided during the WMT-2016 shared task on automatic post-editing and demonstrate that dual-attention models that incorporate all data available in the APE scenario in a single model improve on the best shared-task system and on all results published after the shared task. Dual-attention models combined with hard attention remain competitive despite applying fewer changes to the input.
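The dual-attention idea described above, a single decoder attending over both the raw MT output (mt) and the source sentence (src), can be pictured with a short sketch. The snippet below is a minimal illustration assuming simple dot-product attention and a concatenated context vector; all function and variable names are hypothetical and are not taken from the paper's implementation.

```python
# Minimal sketch of a dual-attention step for {mt, src} -> pe, assuming
# dot-product attention over two separate encoders. Names and shapes are
# illustrative only, not the paper's actual architecture.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_state, encoder_states):
    """Weight encoder states by their similarity to the current decoder
    state and return the weighted context vector."""
    scores = encoder_states @ decoder_state          # (T,)
    weights = softmax(scores)                        # (T,)
    return weights @ encoder_states                  # (d,)

def dual_attention_context(decoder_state, mt_states, src_states):
    """Attend separately over the mt and src encoders, then concatenate
    the two context vectors so the decoder conditions on both inputs."""
    ctx_mt = attend(decoder_state, mt_states)
    ctx_src = attend(decoder_state, src_states)
    return np.concatenate([ctx_mt, ctx_src])         # (2d,)

# Toy usage: hidden size 4, mt output of 5 tokens, src sentence of 6 tokens.
rng = np.random.default_rng(0)
s_t = rng.normal(size=4)
mt_enc = rng.normal(size=(5, 4))
src_enc = rng.normal(size=(6, 4))
print(dual_attention_context(s_t, mt_enc, src_enc).shape)  # (8,)
```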
- Anthology ID: I17-1013
- Volume: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month: November
- Year: 2017
- Address: Taipei, Taiwan
- Venue: IJCNLP
- Publisher: Asian Federation of Natural Language Processing
- Pages: 120–129
- URL: https://aclanthology.org/I17-1013
- Cite (ACL): Marcin Junczys-Dowmunt and Roman Grundkiewicz. 2017. An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 120–129, Taipei, Taiwan. Asian Federation of Natural Language Processing.
- Cite (Informal): An Exploration of Neural Sequence-to-Sequence Architectures for Automatic Post-Editing (Junczys-Dowmunt & Grundkiewicz, IJCNLP 2017)
- PDF: https://preview.aclanthology.org/ingestion-script-update/I17-1013.pdf
- Data: WMT 2016