Abstract
Question answering (QA) systems need to provide exact answers to the questions posed to them. This can only be achieved through precise processing of the question. One important step in this processing is detecting the expected answer type, which is done by extracting the headword of the question and identifying its semantic type. We have annotated the headword and assigned UMLS semantic types to 643 factoid/list questions from the BioASQ training data. We present statistics on the corpus and a preliminary evaluation in baseline experiments. We also discuss the challenges in both the manual annotation and the automatic detection of the headwords and the semantic types. We believe that this is a valuable resource for both training and evaluation of biomedical QA systems. The corpus is available at: https://github.com/mariananeves/BioMedLAT.
- Anthology ID:
- W16-4407
- Volume:
- Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Key-Sun Choi, Christina Unger, Piek Vossen, Jin-Dong Kim, Noriko Kando, Axel-Cyrille Ngonga Ngomo
- Venue:
- WS
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 49–58
- URL:
- https://aclanthology.org/W16-4407
- Cite (ACL):
- Mariana Neves and Milena Kraus. 2016. BioMedLAT Corpus: Annotation of the Lexical Answer Type for Biomedical Questions. In Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016), pages 49–58, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- BioMedLAT Corpus: Annotation of the Lexical Answer Type for Biomedical Questions (Neves & Kraus, 2016)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/W16-4407.pdf
- Code:
- mariananeves/BioMedLAT
- Data:
- BioASQ