Abstract
For cross-language information retrieval (CLIR), queries or documents are often translated into the other language to create a monolingual retrieval situation. Having surveyed recent research results on translation-based CLIR, we are convinced that an effective query translation method is essential for a practical CLIR system of reasonable quality. After summarizing the arguments and methods for query translation and the survey results for dictionary-based translation methods, this paper describes a relatively simple yet effective method that uses mutual information to handle the ambiguity problem, which is known to be the major factor in the low performance relative to the monolingual situation. Our experimental results on the TREC-6 collection show that this method can achieve up to 85% of the performance of monolingual retrieval and 96% of that of manual disambiguation.
- Anthology ID:
- 1999.mtsummit-1.25
- Volume:
- Proceedings of Machine Translation Summit VII
- Month:
- September 13-17
- Year:
- 1999
- Address:
- Singapore, Singapore
- Venue:
- MTSummit
- Pages:
- 165–174
- URL:
- https://aclanthology.org/1999.mtsummit-1.25
- Cite (ACL):
- Sung Hyon Myaeng and Mung-Gil Jang. 1999. Complementing dictionary-based query translations with corpus statistics for cross-language IR. In Proceedings of Machine Translation Summit VII, pages 165–174, Singapore, Singapore.
- Cite (Informal):
- Complementing dictionary-based query translations with corpus statistics for cross-language IR (Myaeng & Jang, MTSummit 1999)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/1999.mtsummit-1.25.pdf