Online Automatic Post-editing for MT in a Multi-Domain Translation Environment
Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, Marco Turchi
Abstract
Automatic post-editing (APE) for machine translation (MT) aims to fix recurrent errors made by the MT decoder by learning from correction examples. In controlled evaluation scenarios, the representativeness of the training set with respect to the test data is a key factor in achieving good performance. Real-life scenarios, however, do not guarantee such favorable learning conditions. Ideally, to be integrated in a real professional translation workflow (e.g. to play a role in a computer-assisted translation framework), APE tools should be flexible enough to cope with continuous streams of diverse data coming from different domains/genres. To cope with this problem, we propose an online APE framework that is: i) robust to data diversity (i.e. capable of learning and applying correction rules in the right contexts) and ii) able to evolve over time (by continuously extending and refining its knowledge). In a comparative evaluation, with English-German test data coming in random order from two different domains, we show the effectiveness of our approach, which outperforms a strong batch system and the state of the art in online APE.
- Anthology ID:
- E17-1050
- Volume:
- Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Mirella Lapata, Phil Blunsom, Alexander Koller
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 525–535
- URL:
- https://aclanthology.org/E17-1050
- Cite (ACL):
- Rajen Chatterjee, Gebremedhen Gebremelak, Matteo Negri, and Marco Turchi. 2017. Online Automatic Post-editing for MT in a Multi-Domain Translation Environment. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers, pages 525–535, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Online Automatic Post-editing for MT in a Multi-Domain Translation Environment (Chatterjee et al., EACL 2017)
- PDF:
- https://preview.aclanthology.org/landing_page/E17-1050.pdf
- Data
- WMT 2016