ReadTwice: Reading Very Large Documents with Memories

Yury Zemlyanskiy; Joshua Ainslie; Michiel de Jong; Philip Pham; Ilya Eckstein; Fei Sha

doi:10.18653/v1/2021.naacl-main.408

Abstract

Knowledge-intensive tasks such as question answering often require assimilating information from different sections of large inputs such as books or article collections. We propose ReadTwice, a simple and effective technique that combines several strengths of prior approaches to model long-range dependencies with Transformers. The main idea is to read text in small segments, in parallel, summarizing each segment into a memory table to be used in a second read of the text. We show that the method outperforms models of comparable size on several question answering (QA) datasets and sets a new state of the art on the challenging NarrativeQA task, with questions about entire books.
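
The two-pass idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hash-seeded stand-in for the Transformer encoder (`embed`), one mean-pooled summary vector per segment as the memory table, and a single softmax attention step over that table in the second read. All function names and parameters here are hypothetical.

```python
import math
import random


def split_into_segments(tokens, segment_len):
    """Cut a long token list into consecutive fixed-size segments."""
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]


def embed(token, dim=8):
    """Toy deterministic embedding (assumed stand-in for a Transformer encoder)."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def first_read(segment):
    """First read: encode one segment on its own and summarize it as a mean vector."""
    vectors = [embed(tok) for tok in segment]
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def attend(query, memory):
    """Softmax dot-product attention of one query vector over the memory table."""
    scores = [sum(q * m for q, m in zip(query, row)) for row in memory]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(query)
    return [sum(w / total * row[d] for w, row in zip(weights, memory)) for d in range(dim)]


def read_twice(tokens, segment_len=4):
    """Return the memory table and the memory-enriched token representations."""
    segments = split_into_segments(tokens, segment_len)
    # First read: segments are independent, so this step could run in parallel.
    memory = [first_read(seg) for seg in segments]
    # Second read: every token also sees summaries of all segments.
    outputs = []
    for seg in segments:
        for tok in seg:
            local = embed(tok)
            context = attend(local, memory)
            outputs.append([a + b for a, b in zip(local, context)])
    return memory, outputs


tokens = "the quick brown fox jumps over the lazy dog again".split()
memory, outputs = read_twice(tokens, segment_len=4)
print(len(memory), len(outputs))
```

In this sketch the memory table grows with the number of segments rather than the number of tokens, which is what lets the second read draw on the whole input without attending over every token.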

Anthology ID: 2021.naacl-main.408
Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: June
Year: 2021
Address: Online
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 5189–5195
URL: https://aclanthology.org/2021.naacl-main.408
DOI: 10.18653/v1/2021.naacl-main.408
Cite (ACL): Yury Zemlyanskiy, Joshua Ainslie, Michiel de Jong, Philip Pham, Ilya Eckstein, and Fei Sha. 2021. ReadTwice: Reading Very Large Documents with Memories. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5189–5195, Online. Association for Computational Linguistics.
Cite (Informal): ReadTwice: Reading Very Large Documents with Memories (Zemlyanskiy et al., NAACL 2021)
PDF: https://preview.aclanthology.org/ingestion-script-update/2021.naacl-main.408.pdf
Video: https://preview.aclanthology.org/ingestion-script-update/2021.naacl-main.408.mp4
Data: HotpotQA, NarrativeQA, TriviaQA
