Learning from Parenthetical Sentences for Term Translation in Machine Translation

Guoping Huang, Jiajun Zhang, Yu Zhou, Chengqing Zong


Abstract
Terms are pervasive in specific domains, and term translation plays a critical role in domain-specific machine translation (MT) tasks. However, translating them correctly is challenging because of the huge number of pre-existing terms and the endless stream of new ones. To achieve better term translation quality, it is necessary to inject external term knowledge into the underlying MT system. Fortunately, plenty of term translation knowledge is available in parenthetical sentences on the Internet. In this paper, we propose a simple, straightforward, and effective framework to improve term translation by learning from parenthetical sentences. This framework includes: (1) a focused web crawler; (2) a parenthetical sentence filter, which acquires parenthetical sentences that contain bilingual term pairs; (3) a term translation knowledge extractor, which extracts bilingual term translation candidates; and (4) a probability learner, which generates the term translation table for MT decoders. Extensive experiments demonstrate that our proposed framework significantly improves the translation quality of terms and sentences.
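To make components (2) through (4) concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a parenthetical sentence is Chinese text immediately followed by an English phrase in parentheses, that translation candidates are suffixes of the Chinese text to the left of the parenthesis, and that probabilities are plain relative frequencies. The regular expression, the function names, and the `max_len` parameter are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import re
from collections import Counter, defaultdict

# Assumption: Chinese text directly followed by an English phrase in
# half-width or full-width parentheses, e.g. "机器翻译(machine translation)".
PAREN = re.compile(r"([\u4e00-\u9fff]+)[(（]([A-Za-z][A-Za-z \-]*[A-Za-z])[)）]")


def filter_parenthetical(sentences):
    """Keep only sentences containing at least one Chinese(English) pattern."""
    return [s for s in sentences if PAREN.search(s)]


def extract_candidates(sentence, max_len=6):
    """Yield (english, chinese) pairs; each Chinese candidate is a suffix
    of the text to the left of the parenthesis, up to max_len characters."""
    for match in PAREN.finditer(sentence):
        left, english = match.group(1), match.group(2).lower()
        for n in range(1, min(max_len, len(left)) + 1):
            yield english, left[-n:]


def learn_table(sentences, max_len=6):
    """Estimate P(chinese | english) by relative frequency over candidates."""
    counts = defaultdict(Counter)
    for sentence in filter_parenthetical(sentences):
        for english, chinese in extract_candidates(sentence, max_len):
            counts[english][chinese] += 1
    return {
        english: {zh: c / sum(ctr.values()) for zh, c in ctr.items()}
        for english, ctr in counts.items()
    }
```

Given several sentences that gloss the same English term, candidates that recur across different left contexts receive more probability mass than candidates that include sentence-specific context. Relative frequency alone cannot separate the correct term from its own shorter suffixes, so a practical system would need stronger evidence for the left boundary than this sketch uses.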
Anthology ID:
W17-6005
Volume:
Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing
Month:
December
Year:
2017
Address:
Taiwan
Venue:
SIGHAN
SIG:
SIGHAN
Publisher:
Association for Computational Linguistics
Pages:
37–45
URL:
https://aclanthology.org/W17-6005
Cite (ACL):
Guoping Huang, Jiajun Zhang, Yu Zhou, and Chengqing Zong. 2017. Learning from Parenthetical Sentences for Term Translation in Machine Translation. In Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing, pages 37–45, Taiwan. Association for Computational Linguistics.
Cite (Informal):
Learning from Parenthetical Sentences for Term Translation in Machine Translation (Huang et al., SIGHAN 2017)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W17-6005.pdf