MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models
Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant
Abstract
Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus (Ahmad et al., 2019). This dataset paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets. We explore systematic retrieval-based evaluation and transfer learning across domains over these datasets using a number of strong baselines, including two supervised neural models, based on fine-tuning BERT and USE-QA models respectively, as well as a surprisingly effective information retrieval baseline, BM25. Five of these tasks contain both training and test data, while three contain test data only. Performing cross training on the five tasks with training data shows that while a general model covering all domains is achievable, the best performance is often obtained by training exclusively on in-domain data.
- Anthology ID:
- 2021.adaptnlp-1.10
- Volume:
- Proceedings of the Second Workshop on Domain Adaptation for NLP
- Month:
- April
- Year:
- 2021
- Address:
- Kyiv, Ukraine
- Editors:
- Eyal Ben-David, Shay Cohen, Ryan McDonald, Barbara Plank, Roi Reichart, Guy Rotman, Yftah Ziser
- Venue:
- AdaptNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 94–104
- URL:
- https://aclanthology.org/2021.adaptnlp-1.10
- Cite (ACL):
- Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, and Noah Constant. 2021. MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. In Proceedings of the Second Workshop on Domain Adaptation for NLP, pages 94–104, Kyiv, Ukraine. Association for Computational Linguistics.
- Cite (Informal):
- MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models (Guo et al., AdaptNLP 2021)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2021.adaptnlp-1.10.pdf
- Code:
- google-research-datasets/MultiReQA
- Data:
- BioASQ, HotpotQA, MRQA, Natural Questions, ReQA, SQuAD, SearchQA, TriviaQA