Online Automatic Post-editing for MT in a Multi-Domain Translation Environment

Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, Marco Turchi


Abstract
Automatic post-editing (APE) for machine translation (MT) aims to fix recurrent errors made by the MT decoder by learning from correction examples. In controlled evaluation scenarios, the representativeness of the training set with respect to the test data is a key factor in achieving good performance. Real-life scenarios, however, do not guarantee such favorable learning conditions. Ideally, to be integrated in a real professional translation workflow (e.g. to play a role in a computer-assisted translation framework), APE tools should be flexible enough to cope with continuous streams of diverse data coming from different domains/genres. To address this problem, we propose an online APE framework that is: i) robust to data diversity (i.e. capable of learning and applying correction rules in the right contexts) and ii) able to evolve over time (by continuously extending and refining its knowledge). In a comparative evaluation, with English-German test data coming in random order from two different domains, we show the effectiveness of our approach, which outperforms a strong batch system and the state of the art in online APE.
Anthology ID:
E17-1050
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
525–535
URL:
https://aclanthology.org/E17-1050
Cite (ACL):
Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, and Marco Turchi. 2017. Online Automatic Post-editing for MT in a Multi-Domain Translation Environment. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 525–535, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Online Automatic Post-editing for MT in a Multi-Domain Translation Environment (Chatterjee et al., EACL 2017)
PDF:
https://preview.aclanthology.org/landing_page/E17-1050.pdf
Data
WMT 2016