Abstract
Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system through human post-edits. APE systems are usually trained by complementing human post-edited data with large, artificial data generated through back-translations, a time-consuming process often no easier than training an MT system from scratch. In this paper, we propose an alternative where we fine-tune pre-trained BERT models on both the encoder and decoder of an APE system, exploring several parameter sharing strategies. By only training on a dataset of 23K sentences for 3 hours on a single GPU, we obtain results that are competitive with systems that were trained on 5M artificial sentences. When we add this artificial data, our method obtains state-of-the-art results.
- Anthology ID:
- P19-1292
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3050–3056
- URL:
- https://aclanthology.org/P19-1292
- DOI:
- 10.18653/v1/P19-1292
- Cite (ACL):
- Gonçalo M. Correia and André F. T. Martins. 2019. A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3050–3056, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning (Correia & Martins, ACL 2019)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/P19-1292.pdf
- Code
- deep-spin/OpenNMT-APE
- Data
- eSCAPE
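The abstract describes initializing both the encoder and the decoder of an APE model from the same pre-trained BERT weights, with several parameter sharing strategies. A minimal sketch of the tying idea (not the authors' code; `ToyBertLayer`, `build_ape_model`, and the weight names are illustrative stand-ins for real BERT modules):

```python
import copy

class ToyBertLayer:
    """Stand-in for one pre-trained BERT layer: a dict of named weight groups."""
    def __init__(self, weights):
        self.weights = weights  # e.g. {"self_attn": [...], "ffn": [...]}

def build_ape_model(pretrained, share_self_attn=True):
    """Initialize encoder and decoder layers from the same pre-trained weights.

    With share_self_attn=True, each decoder layer's self-attention weights are
    tied (the very same object) to the corresponding encoder layer's, so a
    gradient update to one is seen by both -- one possible sharing strategy.
    With share_self_attn=False, the decoder gets an independent copy.
    """
    encoder = [ToyBertLayer(copy.deepcopy(l.weights)) for l in pretrained]
    decoder = []
    for enc_layer in encoder:
        w = copy.deepcopy(enc_layer.weights)          # independent copy
        if share_self_attn:
            w["self_attn"] = enc_layer.weights["self_attn"]  # tied reference
        decoder.append(ToyBertLayer(w))
    return encoder, decoder
```

In a real implementation the tying would be done on `torch.nn.Parameter` objects (assigning the same parameter to both modules), but the reference-sharing mechanics are the same as in this toy version.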