End-to-End Training of Neural Retrievers for Open-Domain Question Answering

Devendra Sachan; Mostofa Patwary; Mohammad Shoeybi; Neel Kant; Wei Ping; William L. Hamilton; Bryan Catanzaro

doi:10.18653/v1/2021.acl-long.519

Abstract

Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach leads to absolute gains of 2+ points over the previous best result in top-20 retrieval accuracy on the Natural Questions and TriviaQA datasets. We next explore two approaches for end-to-end training of the reader and retriever components in OpenQA models, which differ in how the reader ingests the retrieved documents. Our experiments demonstrate the effectiveness of these approaches, as we obtain state-of-the-art results. On the Natural Questions dataset, we obtain a top-20 retrieval accuracy of 84%, an improvement of 5 points over the recent DPR model. We also achieve good results on answer extraction, outperforming recent models like REALM and RAG by 3+ points.
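The abstract reports results as top-20 retrieval accuracy. As a rough illustration only, the sketch below assumes the common definition of this metric: the fraction of questions for which at least one of the top-k retrieved passages contains a gold answer string. The function name, the case-insensitive substring match, and the toy data are assumptions made for this example and are not taken from the paper or its released code.

```python
from typing import Sequence


def top_k_retrieval_accuracy(
    retrieved: Sequence[Sequence[str]],
    answers: Sequence[Sequence[str]],
    k: int = 20,
) -> float:
    """Fraction of questions whose top-k passages contain any gold answer.

    retrieved[i] is the ranked passage list for question i (best first).
    answers[i] is the list of acceptable answer strings for question i.
    Matching is a simple case-insensitive substring test (an assumption).
    """
    if len(retrieved) != len(answers):
        raise ValueError("retrieved and answers must have the same length")
    if not retrieved:
        return 0.0

    hits = 0
    for passages, golds in zip(retrieved, answers):
        top_passages = passages[:k]
        # A question counts as a hit if any gold answer appears in any top-k passage.
        if any(g.lower() in p.lower() for p in top_passages for g in golds):
            hits += 1
    return hits / len(retrieved)


if __name__ == "__main__":
    # Toy data, invented for illustration.
    ranked = [
        ["Berlin is in Germany.", "Paris is the capital of France."],
        ["The Moon orbits Earth.", "Mars is red."],
    ]
    gold = [["Paris"], ["Jupiter"]]
    print(top_k_retrieval_accuracy(ranked, gold, k=2))
```

Evaluation scripts in practice usually normalize and tokenize text before matching, so numbers from this sketch would not be directly comparable to the reported ones.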

Anthology ID: 2021.acl-long.519
Volume: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month: August
Year: 2021
Address: Online
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Venues: ACL | IJCNLP
Publisher: Association for Computational Linguistics
Pages: 6648–6662
URL: https://aclanthology.org/2021.acl-long.519
DOI: 10.18653/v1/2021.acl-long.519
Cite (ACL): Devendra Sachan, Mostofa Patwary, Mohammad Shoeybi, Neel Kant, Wei Ping, William L. Hamilton, and Bryan Catanzaro. 2021. End-to-End Training of Neural Retrievers for Open-Domain Question Answering. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6648–6662, Online. Association for Computational Linguistics.
Cite (Informal): End-to-End Training of Neural Retrievers for Open-Domain Question Answering (Sachan et al., ACL-IJCNLP 2021)
PDF: https://preview.aclanthology.org/ingest-acl-2023-videos/2021.acl-long.519.pdf
Video: https://preview.aclanthology.org/ingest-acl-2023-videos/2021.acl-long.519.mp4
Code: NVIDIA/Megatron-LM + additional community code
Data: Natural Questions, TriviaQA
