Long Document Summarization with Top-down and Bottom-up Inference

Bo Pang; Erik Nijkamp; Wojciech Kryściński; Silvio Savarese; Yingbo Zhou; Caiming Xiong

Abstract

Text summarization aims to condense long documents while retaining key information. Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents. Most recent models infer the latent representations with a transformer encoder, which is purely bottom-up and thus does not capture long-distance context well. In addition, self-attention-based models face the challenge of quadratic complexity with respect to sequence length. We propose a method to improve summarization models on these two aspects. Our method assumes a hierarchical latent structure of a document, where the top level captures long-range dependencies at a coarser time scale and the bottom token level preserves the details. Critically, our method enables token representations to be updated in both a bottom-up and a top-down manner. In the bottom-up pass, token representations are inferred with local self-attention to leverage its efficiency. Top-down correction is then applied to allow tokens to capture global context. We demonstrate the effectiveness of the method on a diverse set of summarization datasets, including narrative, conversational, and scientific documents as well as news. Our model achieves state-of-the-art performance on a wide range of long document summarization benchmarks, compared to recent efficient transformers. We show that our model can summarize an entire book and achieve competitive performance using 0.27% of the parameters and much less training data, compared to a recent GPT-3-based model. These results indicate the general applicability and benefits of the framework.
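The two passes described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the window size, the fixed-length average pooling used to form top-level units, the residual additions, and the absence of learned projections are all assumptions made to keep the example short, and the function names are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v, mask=None):
    """Scaled dot-product attention; `mask` is True where attending is allowed."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = np.where(mask, scores, -1e9)
    return softmax(scores) @ v


def bottom_up(tokens, window=2):
    """Bottom-up pass: each token attends only to neighbours within `window`."""
    n = tokens.shape[0]
    idx = np.arange(n)
    local = np.abs(idx[:, None] - idx[None, :]) <= window
    return tokens + attention(tokens, tokens, tokens, local)


def top_down_bottom_up(tokens, window=2, segment=4):
    """Local bottom-up inference followed by a top-down correction."""
    n, d = tokens.shape
    bottom = bottom_up(tokens, window)

    # Assumed pooling: average every `segment` tokens into one top-level unit,
    # dividing by the number of real (non-padding) tokens in each segment.
    pad = (-n) % segment
    padded = np.pad(bottom, ((0, pad), (0, 0)))
    counts = np.pad(np.ones(n), (0, pad)).reshape(-1, segment).sum(axis=1, keepdims=True)
    top = padded.reshape(-1, segment, d).sum(axis=1) / counts

    # Full self-attention is affordable here because the top level is short.
    top = top + attention(top, top, top)

    # Top-down correction: tokens query the top-level units for global context.
    return bottom + attention(bottom, top, top)
```

In this sketch the bottom-up pass costs on the order of n × window attention scores rather than n², and a token only receives information from distant positions through the top-down step.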

Anthology ID: 2023.findings-eacl.94
Volume: Findings of the Association for Computational Linguistics: EACL 2023
Month: May
Year: 2023
Address: Dubrovnik, Croatia
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1267–1284
URL: https://aclanthology.org/2023.findings-eacl.94
Cite (ACL): Bo Pang, Erik Nijkamp, Wojciech Kryscinski, Silvio Savarese, Yingbo Zhou, and Caiming Xiong. 2023. Long Document Summarization with Top-down and Bottom-up Inference. In Findings of the Association for Computational Linguistics: EACL 2023, pages 1267–1284, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal): Long Document Summarization with Top-down and Bottom-up Inference (Pang et al., Findings 2023)
PDF: https://preview.aclanthology.org/starsem-semeval-split/2023.findings-eacl.94.pdf
