Incremental Re-Training of a Hybrid English-French MT System with Customer Translation Memory Data

Evgeny Matusov


Abstract
In this paper, we present SAIC’s hybrid machine translation (MT) system and show how it was adapted to the needs of our customer, a major global fashion company. The adaptation was performed in two ways: off-line selection of domain-relevant parallel and monolingual data from a background database, and on-line incremental adaptation with customer parallel and translation memory (TM) data. The translation memory was integrated into the statistical search using two novel features. We show that these features can be used to produce nearly perfect translations of data that fully or largely matches the TM entries, without sacrificing translation quality on data that has no TM matches. We also describe how the human post-editing effort was reduced, not only by significantly better MT quality after adaptation, but also by improved formatting and readability of the MT output.
Anthology ID:
2012.amta-commercial.11
Volume:
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Commercial MT User Program
Month:
October 28-November 1
Year:
2012
Address:
San Diego, California, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
URL:
https://aclanthology.org/2012.amta-commercial.11
Cite (ACL):
Evgeny Matusov. 2012. Incremental Re-Training of a Hybrid English-French MT System with Customer Translation Memory Data. In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Commercial MT User Program, San Diego, California, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Incremental Re-Training of a Hybrid English-French MT System with Customer Translation Memory Data (Matusov, AMTA 2012)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2012.amta-commercial.11.pdf