Abstract
Terms are pervasive in specific domains, and term translation plays a critical role in domain-specific machine translation (MT) tasks. However, translating them correctly is challenging because of the huge number of pre-existing terms and the endless stream of new ones. To achieve better term translation quality, it is necessary to inject external term knowledge into the underlying MT system. Fortunately, there is plenty of term translation knowledge in parenthetical sentences on the Internet. In this paper, we propose a simple, straightforward and effective framework to improve term translation by learning from parenthetical sentences. This framework includes: (1) a focused web crawler; (2) a parenthetical sentence filter, which acquires parenthetical sentences containing bilingual term pairs; (3) a term translation knowledge extractor, which extracts bilingual term translation candidates; (4) a probability learner, which generates the term translation table for MT decoders. Extensive experiments demonstrate that our proposed framework significantly improves the translation quality of terms and sentences.
- Anthology ID:
- W17-6005
- Volume:
- Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing
- Month:
- December
- Year:
- 2017
- Address:
- Taiwan
- Editors:
- Yue Zhang, Zhifang Sui
- Venue:
- SIGHAN
- SIG:
- SIGHAN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 37–45
- URL:
- https://aclanthology.org/W17-6005
- Cite (ACL):
- Guoping Huang, Jiajun Zhang, Yu Zhou, and Chengqing Zong. 2017. Learning from Parenthetical Sentences for Term Translation in Machine Translation. In Proceedings of the 9th SIGHAN Workshop on Chinese Language Processing, pages 37–45, Taiwan. Association for Computational Linguistics.
- Cite (Informal):
- Learning from Parenthetical Sentences for Term Translation in Machine Translation (Huang et al., SIGHAN 2017)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/W17-6005.pdf
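
Steps (2) to (4) of the framework described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the regular expression, the function names (`filter_parenthetical`, `extract_candidates`, `learn_table`, `best_translation`), the suffix-based candidate generation, and the relative-frequency score are all assumptions made for illustration. The sketch assumes Chinese text with an English gloss in parentheses and treats every suffix of the Chinese run before the parenthesis as a candidate term.

```python
import re
from collections import Counter, defaultdict

# Assumed pattern (not from the paper): a run of Chinese characters
# immediately followed by an English gloss in ASCII or full-width parentheses.
PAREN = re.compile(r"([\u4e00-\u9fff]+)\s*[（(]\s*([A-Za-z][A-Za-z \-]*)\s*[)）]")


def filter_parenthetical(sentences):
    """Step (2), simplified: keep sentences that contain a parenthetical gloss."""
    return [s for s in sentences if PAREN.search(s)]


def extract_candidates(sentence, max_len=6):
    """Step (3), simplified: for each gloss, propose every suffix
    (2..max_len characters) of the preceding Chinese run as a candidate."""
    out = []
    for m in PAREN.finditer(sentence):
        run, en = m.group(1), m.group(2).strip().lower()
        suffixes = [run[-n:] for n in range(2, min(max_len, len(run)) + 1)]
        if suffixes:
            out.append((en, suffixes))
    return out


def learn_table(sentences, max_len=6):
    """Step (4), simplified: score each candidate by the fraction of
    occurrences of the English term that it precedes (a crude stand-in
    for a learned translation probability)."""
    pair_counts = defaultdict(Counter)
    occurrences = Counter()
    for s in filter_parenthetical(sentences):
        for en, suffixes in extract_candidates(s, max_len):
            occurrences[en] += 1
            pair_counts[en].update(suffixes)
    return {
        en: {zh: c / occurrences[en] for zh, c in cands.items()}
        for en, cands in pair_counts.items()
    }


def best_translation(table, en):
    """Pick the highest-scoring candidate, preferring longer ones on ties."""
    cands = table[en]
    return max(cands, key=lambda zh: (cands[zh], len(zh)))


sentences = [
    "我们研究机器翻译(machine translation)的方法",
    "本文介绍机器翻译(machine translation)系统",
    "今天天气很好",
]
table = learn_table(sentences)
print(best_translation(table, "machine translation"))  # 机器翻译
```

Because the two example sentences have different left contexts, only the suffixes up to the true term boundary occur in both, so the longest fully supported suffix is selected. A real system would need a more robust boundary and probability model than this heuristic.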