UNIFIEDQA: Crossing Format Boundaries with a Single QA System
Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi
Abstract
Question answering (QA) tasks have been posed using a variety of formats, such as extractive span selection, multiple choice, etc. This has led to format-specialized models, and even to an implicit division in the QA community. We argue that such boundaries are artificial and perhaps unnecessary, given the reasoning abilities we seek to teach are not governed by the format. As evidence, we use the latest advances in language modeling to build a single pre-trained QA model, UNIFIEDQA, that performs well across 19 QA datasets spanning 4 diverse formats. UNIFIEDQA performs on par with 8 different models that were trained on individual datasets themselves. Even when faced with 12 unseen datasets of observed formats, UNIFIEDQA performs surprisingly well, showing strong generalization from its outof-format training data. Finally, simply finetuning this pre trained QA model into specialized models results in a new state of the art on 10 factoid and commonsense question answering datasets, establishing UNIFIEDQA as a strong starting point for building QA systems.- Anthology ID:
- 2020.findings-emnlp.171
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Trevor Cohn, Yulan He, Yang Liu
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1896–1907
- Language:
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2020.findings-emnlp.171/
- DOI:
- 10.18653/v1/2020.findings-emnlp.171
- Cite (ACL):
- Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, and Hannaneh Hajishirzi. 2020. UNIFIEDQA: Crossing Format Boundaries with a Single QA System. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1896–1907, Online. Association for Computational Linguistics.
- Cite (Informal):
- UNIFIEDQA: Crossing Format Boundaries with a Single QA System (Khashabi et al., Findings 2020)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2020.findings-emnlp.171.pdf
- Code
- allenai/unifiedqa + additional community code
- Data
- ARC (AI2 Reasoning Challenge), BoolQ, CommonsenseQA, DROP, MCTest, MMLU, MultiRC, NarrativeQA, NewsQA, OpenBookQA, PIQA, QASC, Quoref, RACE, ROPES, SIQA, SQuAD, WinoGrande