Document retrieval and question answering in medical documents. A large-scale corpus challenge.

Curea Eric

doi:10.26615/978-954-452-044-1_001

Abstract

When applied to large datasets, information retrieval typically works by first isolating a subset of documents from the larger collection and then performing low-level processing of the text. This isolation is usually carried out by adding index terms to each document in the collection. In this paper we deal with automatic document classification and index-term detection applied to large-scale medical corpora. Our methodology employs a linear classifier, and we test our results on the BioASQ training corpus, a collection of 12 million MeSH-indexed medical abstracts. We cover term indexing, result retrieval, and result ranking based on distributed word representations.
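The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract says only that a linear classifier and distributed word representations are used, so the perceptron update, the bag-of-words features, the averaged-vector cosine ranking, and all toy documents, labels, and two-dimensional vectors below are assumptions chosen for illustration.

```python
import math
from collections import Counter

# Toy training set (invented): abstract text paired with MeSH-style index terms.
TRAIN = [
    ("insulin therapy lowers blood glucose", {"Diabetes Mellitus"}),
    ("glucose monitoring in diabetes patients", {"Diabetes Mellitus"}),
    ("statin therapy reduces heart attack risk", {"Cardiovascular Diseases"}),
    ("heart failure and blood pressure", {"Cardiovascular Diseases"}),
]

def features(text):
    """Bag-of-words counts for one document."""
    return Counter(text.lower().split())

def score(x, weights, bias):
    """Linear decision function: bias + w . x (unseen words contribute 0)."""
    return bias + sum(weights[w] * c for w, c in x.items())

def train_perceptron(examples, label, epochs=10):
    """One binary linear classifier per index term (perceptron is an assumption)."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for text, labels in examples:
            x = features(text)
            target = 1 if label in labels else -1
            if target * score(x, weights, bias) <= 0:  # misclassified: update
                for w, c in x.items():
                    weights[w] += target * c
                bias += target
    return weights, bias

ALL_TERMS = sorted({t for _, labels in TRAIN for t in labels})
MODELS = {term: train_perceptron(TRAIN, term) for term in ALL_TERMS}

def predict_terms(text):
    """Assign every index term whose classifier scores the text positively."""
    x = features(text)
    return {t for t, (w, b) in MODELS.items() if score(x, w, b) > 0}

# Hypothetical 2-d word vectors standing in for learned word representations.
EMBEDDINGS = {
    "insulin": (0.9, 0.1), "glucose": (0.8, 0.2), "diabetes": (0.85, 0.15),
    "heart": (0.1, 0.9), "statin": (0.2, 0.8), "pressure": (0.15, 0.85),
}

def embed(text):
    """Average the vectors of known words; None if no word is known."""
    vecs = [EMBEDDINGS[w] for w in text.lower().split() if w in EMBEDDINGS]
    if not vecs:
        return None
    return tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query, docs):
    """Order retrieved documents by cosine similarity to the query vector."""
    q = embed(query)
    if q is None:
        return []
    scored = [(cosine(q, embed(d)), d) for d in docs if embed(d) is not None]
    return [d for _, d in sorted(scored, reverse=True)]
```

In this sketch, `predict_terms` plays the role of index-term detection, and `rank` reorders an already retrieved candidate set. A real system at the scale of the BioASQ corpus would need sparse feature storage and pretrained embeddings, neither of which is shown here.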

Anthology ID: W17-8001
Volume: Proceedings of the Biomedical NLP Workshop associated with RANLP 2017
Month: September
Year: 2017
Address: Varna, Bulgaria
Venue: RANLP
Publisher: INCOMA Ltd.
Pages: 1–7
URL: https://doi.org/10.26615/978-954-452-044-1_001
DOI: 10.26615/978-954-452-044-1_001
Cite (ACL): Curea Eric. 2017. Document retrieval and question answering in medical documents. A large-scale corpus challenge. In Proceedings of the Biomedical NLP Workshop associated with RANLP 2017, pages 1–7, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal): Document retrieval and question answering in medical documents. A large-scale corpus challenge. (Eric, RANLP 2017)
PDF: https://doi.org/10.26615/978-954-452-044-1_001