Automatic Post-Editing for Vietnamese

Thanh Vu, Dai Quoc Nguyen


Abstract
Automatic post-editing (APE) is an important remedy for reducing errors in raw translations produced by machine translation (MT) systems or software-aided translation. In this paper, we present a systematic approach to the APE task for Vietnamese. Specifically, we construct the first large-scale dataset of 5M Vietnamese translated and corrected sentence pairs. We then apply strong neural MT models to the APE task using this dataset. Experimental results from both automatic and human evaluations show the effectiveness of the neural MT models in handling the Vietnamese APE task.
Anthology ID:
2021.alta-1.18
Volume:
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association
Month:
December
Year:
2021
Address:
Online
Venue:
ALTA
Publisher:
Australasian Language Technology Association
Pages:
169–173
URL:
https://aclanthology.org/2021.alta-1.18
Cite (ACL):
Thanh Vu and Dai Quoc Nguyen. 2021. Automatic Post-Editing for Vietnamese. In Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association, pages 169–173, Online. Australasian Language Technology Association.
Cite (Informal):
Automatic Post-Editing for Vietnamese (Vu & Nguyen, ALTA 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.alta-1.18.pdf
Code:
tienthanhdhcn/VnAPE