Translation Memory Systems Have a Long Way to Go

Andrea Silvestre Baquero, Ruslan Mitkov


Abstract
Translation memory (TM) systems have changed the work of translators, and translators who do not benefit from these tools are now a tiny minority. These tools rely mostly on fuzzy (surface) matching and cannot exploit already translated texts that are synonymous with, or paraphrased versions of, the text to be translated. The match score is mostly based on character-string similarity, calculated through Levenshtein distance. TM tools have difficulty detecting similarity even between sentences that are minor revisions of sentences already available in the translation memory. This shortcoming of current TM systems is the subject of the present study and was confirmed empirically in the experiments we conducted. To this end, we compiled a small English-Spanish translation memory and applied several lexical and syntactic transformation rules to the source sentences, with both English and Spanish serving as the source language. The results of this study show that current TM systems have a long way to go and highlight the need for TM systems equipped with NLP capabilities, which would spare the translator from translating a sentence again when an almost identical sentence has already been translated.
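To illustrate why character-based matching struggles with paraphrases, the following is a minimal sketch of a Levenshtein-based fuzzy match score. It is not the scoring used by any particular TM tool: the normalization by the longer string's length, the function names, and the example sentences are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_match_score(segment: str, tm_source: str) -> float:
    """Percentage similarity; normalizing by the longer length is an assumption."""
    longest = max(len(segment), len(tm_source))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(segment, tm_source) / longest)


# Hypothetical segments, not taken from the study's translation memory.
stored = "The committee approved the proposal."
minor_edit = "The committee approved the proposals."
passive = "The proposal was approved by the committee."

print(f"minor edit: {fuzzy_match_score(minor_edit, stored):.1f}%")
print(f"passive:    {fuzzy_match_score(passive, stored):.1f}%")
```

Under this sketch, the passive rewording scores lower than the one-character edit even though it carries the same meaning, which is the kind of gap that surface matching cannot close.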
Anthology ID:
W17-7906
Volume:
Proceedings of the Workshop Human-Informed Translation and Interpreting Technology
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Editors:
Irina Temnikova, Constantin Orasan, Gloria Corpas Pastor, Stephan Vogel
Venue:
RANLP
Publisher:
Association for Computational Linguistics, Shoumen, Bulgaria
Pages:
44–51
URL:
https://doi.org/10.26615/978-954-452-042-7_006
DOI:
10.26615/978-954-452-042-7_006
Cite (ACL):
Andrea Silvestre Baquero and Ruslan Mitkov. 2017. Translation Memory Systems Have a Long Way to Go. In Proceedings of the Workshop Human-Informed Translation and Interpreting Technology, pages 44–51, Varna, Bulgaria. Association for Computational Linguistics, Shoumen, Bulgaria.
Cite (Informal):
Translation Memory Systems Have a Long Way to Go (Silvestre Baquero & Mitkov, RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-042-7_006