Fuzzy-match repair using black-box machine translation systems: what can be expected?

John Ortega, Felipe Sánchez-Martínez, Mikel Forcada


Abstract
Computer-aided translation (CAT) tools often use a translation memory (TM) as the key resource to assist translators. A TM contains translation units (TUs), each made up of a source-language segment and a target-language segment; translators take the target segment of the TU suggested by the CAT tool and edit it into the desired translation. Proposals from TMs can be made more useful with techniques such as fuzzy-match repair (FMR), which modifies the words in the target segment that correspond to mismatches identified in the source segment. The target segment is modified by translating the mismatched source sub-segments with an external source of bilingual information (SBI) and applying the translations to the corresponding positions in the target segment. Several combinations of translated sub-segments can be applied to the target segment, which can produce multiple repair candidates. We provide a formal algorithmic description of a method that can use any SBI to generate all possible fuzzy-match repairs, and we perform an oracle evaluation to ascertain the method's potential to improve translation productivity. Using DGT-TM translation memories and the machine translation system Apertium as the single source to build repair operators in three different language pairs, we show that the best repaired fuzzy matches are consistently closer to reference translations than either machine-translated segments or unrepaired fuzzy matches.
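To make the idea in the abstract concrete, the following is a minimal, simplified sketch of fuzzy-match repair with an oracle selection step. It is not the authors' algorithm: the word-level alignment via `difflib`, the toy dictionary standing in for a black-box SBI, the first-occurrence replacement rule, and the use of word-level edit distance as the oracle metric are all assumptions made for illustration, as are all function and variable names.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Toy stand-in for a black-box SBI (hypothetical data, not Apertium output).
TOY_SBI = {"red": "roja", "blue": "azul", "big": "grande", "small": "pequeña"}


def mismatches(s_new, s_tm):
    """Yield (tm_words, new_words) for each replaced span between the two sources."""
    matcher = SequenceMatcher(a=s_tm, b=s_new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            yield s_tm[i1:i2], s_new[j1:j2]


def repair_operators(s_new, s_tm, sbi):
    """Translate both sides of each mismatch; keep pairs the SBI can translate."""
    ops = []
    for old, new in mismatches(s_new, s_tm):
        t_old, t_new = sbi(" ".join(old)), sbi(" ".join(new))
        if t_old and t_new:
            ops.append((t_old.split(), t_new.split()))
    return ops


def apply_op(target, op):
    """Replace the first occurrence of op's old words in target, or return None."""
    old, new = op
    n = len(old)
    for i in range(len(target) - n + 1):
        if target[i:i + n] == old:
            return target[:i] + new + target[i + n:]
    return None


def candidates(target, ops):
    """Apply every non-empty combination of operators to the target segment."""
    results = set()
    for r in range(1, len(ops) + 1):
        for subset in combinations(ops, r):
            repaired = target
            for op in subset:
                repaired = apply_op(repaired, op)
                if repaired is None:
                    break
            if repaired is not None:
                results.add(tuple(repaired))
    return results


def edit_distance(a, b):
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def oracle(cands, reference):
    """Pick the candidate closest to the reference translation."""
    return min(cands, key=lambda c: edit_distance(c, reference))


# Hypothetical example: one TU (s_tm, t_tm) and a new segment to translate.
s_tm = "the red house is big".split()
t_tm = "la casa roja es grande".split()
s_new = "the blue house is small".split()
reference = "la casa azul es pequeña".split()

ops = repair_operators(s_new, s_tm, TOY_SBI.get)
cands = candidates(t_tm, ops)
best = oracle(cands, reference)
print(" ".join(best))
```

In this toy setting the two mismatches yield two repair operators, and applying each combination of them produces several repair candidates, of which the oracle keeps the one nearest the reference. A real SBI would translate sub-segments with surrounding context rather than isolated words.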
Anthology ID:
2016.amta-researchers.3
Volume:
Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track
Month:
October 28 - November 1
Year:
2016
Address:
Austin, TX, USA
Venue:
AMTA
Publisher:
The Association for Machine Translation in the Americas
Pages:
27–39
URL:
https://aclanthology.org/2016.amta-researchers.3
Cite (ACL):
John Ortega, Felipe Sánchez-Martínez, and Mikel Forcada. 2016. Fuzzy-match repair using black-box machine translation systems: what can be expected? In Conferences of the Association for Machine Translation in the Americas: MT Researchers' Track, pages 27–39, Austin, TX, USA. The Association for Machine Translation in the Americas.
Cite (Informal):
Fuzzy-match repair using black-box machine translation systems: what can be expected? (Ortega et al., AMTA 2016)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2016.amta-researchers.3.pdf