Long Document Summarization with Top-down and Bottom-up Inference
Bo Pang, Erik Nijkamp, Wojciech Kryscinski, Silvio Savarese, Yingbo Zhou, Caiming Xiong
Abstract
Text summarization aims to condense long documents and retain key information. Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents. Most recent models infer the latent representations with a transformer encoder, which is purely bottom-up and thus does not capture long-distance context well. Also, self-attention-based models face the challenge of quadratic complexity with respect to sequence length. We propose a method to improve summarization models on these two aspects. Our method assumes a hierarchical latent structure of a document where the top-level captures the long range dependency at a coarser time scale and the bottom token level preserves the details. Critically, our method enables token representations to be updated in both a bottom-up and top-down manner. In the bottom-up pass, token representations are inferred with local self-attention to leverage its efficiency. Top-down correction is then applied to allow tokens to capture global context. We demonstrate the effectiveness on a diverse set of summarization datasets, including narrative, conversational, scientific documents and news. Our model achieves state-of-the-art performance on a wide range of long document summarization benchmarks, compared to recent efficient transformers. We show that our model can summarize an entire book and achieve competitive performance using 0.27% parameters and much less training data, compared to a recent GPT-3-based model. These results indicate the general applicability and benefits of the framework.- Anthology ID:
- 2023.findings-eacl.94
- Volume:
- Findings of the Association for Computational Linguistics: EACL 2023
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1267–1284
- Language:
- URL:
- https://aclanthology.org/2023.findings-eacl.94
- DOI:
- Cite (ACL):
- Bo Pang, Erik Nijkamp, Wojciech Kryscinski, Silvio Savarese, Yingbo Zhou, and Caiming Xiong. 2023. Long Document Summarization with Top-down and Bottom-up Inference. In Findings of the Association for Computational Linguistics: EACL 2023, pages 1267–1284, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Long Document Summarization with Top-down and Bottom-up Inference (Pang et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2023.findings-eacl.94.pdf