Challenges in Generalization in Open Domain Question Answering

Linqing Liu, Patrick Lewis, Sebastian Riedel, Pontus Stenetorp


Abstract
Recent work on Open Domain Question Answering has shown that there is a large discrepancy in model performance between novel test questions and those that largely overlap with training questions. However, it is unclear which aspects of novel questions make them challenging. Drawing upon studies on systematic generalization, we introduce and annotate questions according to three categories that measure different levels and kinds of generalization: training set overlap, compositional generalization (comp-gen), and novel-entity generalization (novel-entity). When evaluating six popular parametric and non-parametric models, we find that for the established Natural Questions and TriviaQA datasets, even the strongest model performance for comp-gen/novel-entity is 13.1/5.4% and 9.6/1.5% lower compared to that for the full test set – indicating the challenge posed by these types of questions. Furthermore, we show that whilst non-parametric models can handle questions containing novel entities relatively well, they struggle with those requiring compositional generalization. Lastly, we find that key question difficulty factors are: cascading errors from the retrieval component, frequency of question pattern, and frequency of the entity.
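The abstract's first category, training set overlap, can be illustrated with a simple proxy: flagging test questions whose normalized form appears verbatim in the training set. This is a hypothetical sketch for intuition only, not the paper's annotation procedure, which distinguishes richer categories (compositional and novel-entity generalization) that simple string matching cannot capture.

```python
import re
import string

def normalize(q: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    q = q.lower()
    q = q.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", q).strip()

def overlap_flags(train_questions, test_questions):
    """Return True for each test question whose normalized form
    appears verbatim in the training set (exact-match overlap proxy)."""
    train_set = {normalize(q) for q in train_questions}
    return [normalize(q) in train_set for q in test_questions]

# Toy illustration (made-up questions, not from the datasets):
train = ["Who wrote Hamlet?", "What is the capital of France?"]
test = ["who wrote hamlet", "Who directed Inception?"]
print(overlap_flags(test_questions=test, train_questions=train))  # → [True, False]
```

Exact match after normalization is a conservative lower bound on overlap; paraphrases of training questions would still count as "novel" under this check, which is one reason manual annotation is needed for the finer-grained categories.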
Anthology ID:
2022.findings-naacl.155
Volume:
Findings of the Association for Computational Linguistics: NAACL 2022
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2014–2029
URL:
https://aclanthology.org/2022.findings-naacl.155
DOI:
10.18653/v1/2022.findings-naacl.155
Bibkey:
Cite (ACL):
Linqing Liu, Patrick Lewis, Sebastian Riedel, and Pontus Stenetorp. 2022. Challenges in Generalization in Open Domain Question Answering. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 2014–2029, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Challenges in Generalization in Open Domain Question Answering (Liu et al., Findings 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2022.findings-naacl.155.pdf
Video:
 https://preview.aclanthology.org/nschneid-patch-1/2022.findings-naacl.155.mp4
Code:
likicode/QA-generalize
Data:
Natural Questions, TriviaQA, WebQuestions