Mung-Gil Jang


1999

Complementing dictionary-based query translations with corpus statistics for cross-language IR
Sung Hyon Myaeng | Mung-Gil Jang
Proceedings of Machine Translation Summit VII

In cross-language information retrieval (CLIR), queries or documents are often translated into the other language to create a monolingual retrieval situation. Having surveyed recent research on translation-based CLIR, we conclude that an effective query translation method is essential for a practical CLIR system of reasonable quality. After summarizing the arguments and methods for query translation and the survey results for dictionary-based translation methods, this paper describes a relatively simple yet effective method that uses mutual information to handle the ambiguity problem, known to be the major factor behind low performance relative to the monolingual situation. Our experimental results on the TREC-6 collection show that this method can achieve up to 85% of the performance of the monolingual retrieval case and 96% of that of the manual disambiguation case.
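The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea: choosing among dictionary translation candidates by their mutual information with the candidates of the other query terms. It assumes pointwise mutual information estimated from document-level co-occurrence in a target-language corpus. The function names (`make_pmi`, `disambiguate`), the toy corpus, and the toy dictionary are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def make_pmi(docs):
    """Build a PMI function from document-level co-occurrence counts (assumed estimator)."""
    n = len(docs)
    term_df = Counter()  # number of documents containing each term
    pair_df = Counter()  # number of documents containing both terms of a pair
    for doc in docs:
        terms = set(doc)
        term_df.update(terms)
        pair_df.update(frozenset(p) for p in combinations(sorted(terms), 2))

    def pmi(x, y):
        joint = pair_df[frozenset((x, y))]
        if joint == 0:
            return float("-inf")  # the pair never co-occurs in the corpus
        # log( P(x, y) / (P(x) * P(y)) ) with probabilities estimated as df / n
        return math.log(joint * n / (term_df[x] * term_df[y]))

    return pmi


def disambiguate(query_terms, dictionary, pmi):
    """For each source term, keep the candidate best supported by the other terms' candidates."""
    chosen = {}
    for term in query_terms:
        others = [c for t in query_terms if t != term for c in dictionary[t]]

        def support(cand):
            # Highest PMI between this candidate and any candidate of another query term.
            scores = (pmi(cand, o) for o in others if o != cand)
            return max(scores, default=float("-inf"))

        chosen[term] = max(dictionary[term], key=support)
    return chosen


# Hypothetical target-language corpus and bilingual dictionary.
docs = [
    ["bank", "interest", "loan"],
    ["bank", "interest", "rate"],
    ["shore", "river", "sand"],
    ["hobby", "music"],
    ["bank", "loan"],
]
dictionary = {
    "bank_src": ["bank", "shore"],
    "interest_src": ["interest", "hobby"],
}
result = disambiguate(["bank_src", "interest_src"], dictionary, make_pmi(docs))
print(result)
```

In this toy setting only "bank" and "interest" co-occur, so they are the candidates selected. A real system would estimate the statistics from a large corpus and might weight several candidates rather than keep a single one.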