MeSH-based dataset for measuring the relevance of text retrieval
Won Gyu Kim, Lana Yeganova, Donald Comeau, W John Wilbur, Zhiyong Lu
Abstract
Creating simulated search environments has been of significant interest in information retrieval, in both general and biomedical search domains. Existing collections include a modest number of queries and are constructed by manually evaluating retrieval results. In this work we propose leveraging MeSH term assignments for creating synthetic test beds. We select a suitable subset of MeSH terms as queries, and utilize MeSH term assignments as pseudo-relevance rankings for retrieval evaluation. Using well-studied retrieval functions, we show that their performance on the proposed data is consistent with similar findings in previous work. We further use the proposed retrieval evaluation framework to better understand how to combine heterogeneous sources of textual information.
- Anthology ID:
- W18-2320
- Volume:
- Proceedings of the BioNLP 2018 workshop
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Venue:
- BioNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 161–165
- URL:
- https://aclanthology.org/W18-2320
- DOI:
- 10.18653/v1/W18-2320
- Cite (ACL):
- Won Gyu Kim, Lana Yeganova, Donald Comeau, W John Wilbur, and Zhiyong Lu. 2018. MeSH-based dataset for measuring the relevance of text retrieval. In Proceedings of the BioNLP 2018 workshop, pages 161–165, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- MeSH-based dataset for measuring the relevance of text retrieval (Kim et al., BioNLP 2018)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/W18-2320.pdf