AmbigQA: Answering Ambiguous Open-domain Questions
Sewon Min, Julian Michael, Hannaneh Hajishirzi, Luke Zettlemoyer
Abstract
Ambiguity is inherent to open-domain question answering; especially when exploring new topics, it can be difficult to ask questions that have a single, unambiguous answer. In this paper, we introduce AmbigQA, a new open-domain question answering task which involves finding every plausible answer, and then rewriting the question for each one to resolve the ambiguity. To study this task, we construct AmbigNQ, a dataset covering 14,042 questions from NQ-open, an existing open-domain QA benchmark. We find that over half of the questions in NQ-open are ambiguous, with diverse sources of ambiguity such as event and entity references. We also present strong baseline models for AmbigQA which we show benefit from weakly supervised learning that incorporates NQ-open, strongly suggesting our new task and data will support significant future research effort. Our data and baselines are available at https://nlp.cs.washington.edu/ambigqa.
- Anthology ID:
- 2020.emnlp-main.466
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5783–5797
- URL:
- https://aclanthology.org/2020.emnlp-main.466
- DOI:
- 10.18653/v1/2020.emnlp-main.466
- Cite (ACL):
- Sewon Min, Julian Michael, Hannaneh Hajishirzi, and Luke Zettlemoyer. 2020. AmbigQA: Answering Ambiguous Open-domain Questions. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5783–5797, Online. Association for Computational Linguistics.
- Cite (Informal):
- AmbigQA: Answering Ambiguous Open-domain Questions (Min et al., EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2020.emnlp-main.466.pdf
- Data
- AmbigNQ, Natural Questions