Mining the Web for Domain-Specific Translations

Jian-Cheng Wu; Peter Wei-Huai Hsu; Chiung-Hui Tseng; Jason S. Chang

Abstract

We introduce a method for learning to find domain-specific translations for a given term on the Web. In our approach, the source term is transformed into an expanded query aimed at maximizing the probability of retrieving translations from a very large collection of mixed-code documents. The method involves automatically generating sets of target-language words from training data in specific domains and automatically selecting the target words that are effective in retrieving documents containing the sought-after translations. At run time, the given term is transformed into an expanded query and submitted to a search engine, and ranked translations are extracted from the document snippets that the search engine returns. We present a prototype, TermMine, which applies the method to a Web search engine. Evaluations over a set of domains and terms show that TermMine outperforms state-of-the-art machine translation systems.

Anthology ID: 2008.amta-papers.20
Volume: Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers
Month: October 21-25
Year: 2008
Address: Waikiki, USA
Venue: AMTA
Publisher: Association for Machine Translation in the Americas
Pages: 212–221
URL: https://aclanthology.org/2008.amta-papers.20
Cite (ACL): Jian-Cheng Wu, Peter Wei-Huai Hsu, Chiung-Hui Tseng, and Jason S. Chang. 2008. Mining the Web for Domain-Specific Translations. In Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers, pages 212–221, Waikiki, USA. Association for Machine Translation in the Americas.
Cite (Informal): Mining the Web for Domain-Specific Translations (Wu et al., AMTA 2008)
PDF: https://preview.aclanthology.org/revert-3132-ingestion-checklist/2008.amta-papers.20.pdf
