Query selection methods for automated corpora construction with a use case in food-drug interactions
Georgeta Bordea, Tsanta Randriatsitohaina, Fleur Mougin, Natalia Grabar, Thierry Hamon
Abstract
In this paper, we address the problem of automatically constructing a relevant corpus of scientific articles about food-drug interactions. There is a growing number of scientific publications that describe food-drug interactions, but currently building a high-coverage corpus that can be used for information extraction purposes is not trivial. We investigate several methods for automating the query selection process using an expert-curated corpus of food-drug interactions. Our experiments show that index term features along with a decision tree classifier are the best approach for this task, and that feature selection approaches, in particular gain ratio, outperform frequency-based methods for query selection.
- Anthology ID: W19-5013
- Volume: Proceedings of the 18th BioNLP Workshop and Shared Task
- Month: August
- Year: 2019
- Address: Florence, Italy
- Editors: Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
- Venue: BioNLP
- SIG: SIGBIOMED
- Publisher: Association for Computational Linguistics
- Pages: 115–124
- URL: https://preview.aclanthology.org/add_missing_videos/W19-5013/
- DOI: 10.18653/v1/W19-5013
- Cite (ACL): Georgeta Bordea, Tsanta Randriatsitohaina, Fleur Mougin, Natalia Grabar, and Thierry Hamon. 2019. Query selection methods for automated corpora construction with a use case in food-drug interactions. In Proceedings of the 18th BioNLP Workshop and Shared Task, pages 115–124, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal): Query selection methods for automated corpora construction with a use case in food-drug interactions (Bordea et al., BioNLP 2019)
- PDF: https://preview.aclanthology.org/add_missing_videos/W19-5013.pdf