A comparative study of query and document translation for cross-language information retrieval

Douglas W. Oard


Abstract
Cross-language retrieval systems use queries in one natural language to guide retrieval of documents that might be written in another. Acquisition and representation of translation knowledge play a central role in this process. This paper explores the utility of two sources of translation knowledge for cross-language retrieval. We have implemented six query translation techniques that use bilingual term lists and one that directly uses the translation output of an existing machine translation system; these are compared with a document translation technique that uses output from the same machine translation system. Average precision measures on a TREC collection suggest that arbitrarily selecting a single dictionary translation is typically no less effective than using every translation in the dictionary, that query translation using a machine translation system can achieve somewhat better effectiveness than simpler techniques, and that document translation may result in further improvements in retrieval effectiveness under some conditions.
Anthology ID:
1998.amta-papers.41
Volume:
Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
October 28–31
Year:
1998
Address:
Langhorne, PA, USA
Venue:
AMTA
Publisher:
Springer
Pages:
472–483
URL:
https://link.springer.com/chapter/10.1007/3-540-49478-2_42
Cite (ACL):
Douglas W. Oard. 1998. A comparative study of query and document translation for cross-language information retrieval. In Proceedings of the Third Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 472–483, Langhorne, PA, USA. Springer.
Cite (Informal):
A comparative study of query and document translation for cross-language information retrieval (Oard, AMTA 1998)
PDF:
https://link.springer.com/chapter/10.1007/3-540-49478-2_42