Abstract
State-of-the-art systems in deep question answering proceed as follows: (1) an initial document retrieval selects relevant documents, which (2) are then processed by a neural network in order to extract the final answer. Yet the exact interplay between both components is poorly understood, especially concerning the number of candidate documents that should be retrieved. We show that choosing a static number of documents - as used in prior research - suffers from a noise-information trade-off and yields suboptimal results. As a remedy, we propose an adaptive document retrieval model. This learns the optimal candidate number for document retrieval, conditional on the size of the corpus and the query. We report extensive experimental results showing that our adaptive approach outperforms state-of-the-art methods on multiple benchmark datasets, as well as in the context of corpora with variable sizes.
- Anthology ID:
- D18-1055
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 576–581
- URL:
- https://aclanthology.org/D18-1055
- DOI:
- 10.18653/v1/D18-1055
- Cite (ACL):
- Bernhard Kratzwald and Stefan Feuerriegel. 2018. Adaptive Document Retrieval for Deep Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 576–581, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Adaptive Document Retrieval for Deep Question Answering (Kratzwald & Feuerriegel, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/D18-1055.pdf
- Code:
- bernhard2202/adaptive-ir-for-qa
- Data:
- SQuAD, WebQuestions, WikiMovies
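
The abstract contrasts a static number of retrieved documents with a number chosen per query. As a rough illustration of that idea only, the sketch below picks a per-query cutoff from the retrieval scores: it keeps the smallest set of top-ranked documents whose normalized scores add up to a chosen share. The function name `adaptive_cutoff`, the parameters `threshold` and `max_docs`, and the cumulative-score rule itself are assumptions made for this example. The abstract does not specify the model, and the sketch does not learn anything or account for corpus size.

```python
from typing import Sequence


def adaptive_cutoff(
    scores: Sequence[float],
    threshold: float = 0.75,  # hypothetical share of total score to cover
    max_docs: int = 20,       # hypothetical upper bound on retrieved documents
) -> int:
    """Return how many top-ranked documents to keep for one query.

    Assumes non-negative retrieval scores (higher means more relevant).
    The scores of the top `max_docs` documents are normalized to sum to 1,
    and the smallest prefix whose cumulative share reaches `threshold`
    is kept.
    """
    ranked = sorted(scores, reverse=True)[:max_docs]
    total = sum(ranked)
    if total <= 0:
        # No usable score signal: fall back to keeping every candidate.
        return len(ranked)

    cumulative = 0.0
    for n, score in enumerate(ranked, start=1):
        cumulative += score / total
        if cumulative >= threshold:
            return n
    return len(ranked)


# One dominant document leads to a small cutoff; flat scores lead to a larger one.
peaked = adaptive_cutoff([10.0, 1.0, 1.0, 1.0], threshold=0.7)
flat = adaptive_cutoff([1.0, 1.0, 1.0, 1.0], threshold=0.7)
```

Under this rule, a query with one clearly dominant document passes few candidates to the reader network (less noise), while a query with flat scores passes more (more information), which is one way to picture the trade-off the abstract names.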