Span Selection Pre-training for Question Answering

Michael Glass; Alfio Gliozzo; Rishav Chakravarti; Anthony Ferritto; Lin Pan; G P Shrivatsa Bhargav; Dinesh Garg; Avirup Sil

doi:10.18653/v1/2020.acl-main.247

Span Selection Pre-training for Question Answering

Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avi Sil

Abstract

BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pretrained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task inspired by reading comprehension to better align the pre-training from memorization to understanding. Span Selection PreTraining (SSPT) poses cloze-like training instances, but rather than draw the answer from the model’s parameters, it is selected from a relevant passage. We find significant and consistent improvements over both BERT-BASE and BERT-LARGE on multiple Machine Reading Comprehension (MRC) datasets. Specifically, our proposed model has strong empirical evidence as it obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 F1 points on short answer prediction. We also show significant impact in HotpotQA, improving answer prediction F1 by 4 points and supporting fact prediction F1 by 1 point and outperforming the previous best system. Moreover, we show that our pre-training approach is particularly effective when training data is limited, improving the learning curve by a large amount.

Anthology ID:: 2020.acl-main.247
Volume:: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:: July
Year:: 2020
Address:: Online
Editors:: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue:: ACL
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 2773–2782
Language:
URL:: https://aclanthology.org/2020.acl-main.247
DOI:: 10.18653/v1/2020.acl-main.247
Bibkey:
Cite (ACL):: Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, and Avi Sil. 2020. Span Selection Pre-training for Question Answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2773–2782, Online. Association for Computational Linguistics.
Cite (Informal):: Span Selection Pre-training for Question Answering (Glass et al., ACL 2020)
Copy Citation:
PDF:: https://preview.aclanthology.org/nschneid-patch-4/2020.acl-main.247.pdf
Video:: http://slideslive.com/38929129
Code: IBM/span-selection-pretraining
Data: HotpotQA, Natural Questions, SQuAD

PDF Search Code Video