Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension

Minjoon Seo, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, Hannaneh Hajishirzi


Abstract
We formalize a new modular variant of current question answering tasks by enforcing complete independence of the document encoder from the question encoder. This formulation addresses a key challenge in machine comprehension by building a standalone representation of the document discourse. It additionally leads to a significant scalability advantage since the encoding of the answer candidate phrases in the document can be pre-computed and indexed offline for efficient retrieval. We experiment with baseline models for the new task, which achieve a reasonable accuracy but significantly underperform unconstrained QA models. We invite the QA research community to engage in Phrase-Indexed Question Answering (PIQA, pika) for closing the gap. The leaderboard is at: nlp.cs.washington.edu/piqa
Anthology ID:
D18-1052
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Note:
Pages:
559–564
Language:
URL:
https://aclanthology.org/D18-1052
DOI:
10.18653/v1/D18-1052
Bibkey:
Cite (ACL):
Minjoon Seo, Tom Kwiatkowski, Ankur Parikh, Ali Farhadi, and Hannaneh Hajishirzi. 2018. Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 559–564, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension (Seo et al., EMNLP 2018)
Copy Citation:
PDF:
https://preview.aclanthology.org/starsem-semeval-split/D18-1052.pdf
Video:
 https://vimeo.com/305205055
Code
 uwnlp/piqa