LOCOST: State-Space Models for Long Document Abstractive Summarization
Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari
Abstract
State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of 𝒪(L log L), this architecture can handle significantly longer sequences than state-of-the-art models that are based on sparse attention patterns. We evaluate our model on a series of long document abstractive summarization tasks. The model reaches a performance level that is 93-96% comparable to the top-performing sparse transformers of the same size while saving up to 50% memory during training and up to 87% during inference. Additionally, LOCOST effectively handles input texts exceeding 600K tokens at inference time, setting new state-of-the-art results on full-book summarization and opening new perspectives for long input processing.
- Anthology ID:
- 2024.eacl-long.69
- Original:
- 2024.eacl-long.69v1
- Version 2:
- 2024.eacl-long.69v2
- Volume:
- Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- EACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1144–1159
- Language:
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2024.eacl-long.69/
- DOI:
- Award:
- Best Paper Award
- Cite (ACL):
- Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, and Patrick Gallinari. 2024. LOCOST: State-Space Models for Long Document Abstractive Summarization. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1144–1159, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- LOCOST: State-Space Models for Long Document Abstractive Summarization (Le Bronnec et al., EACL 2024)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2024.eacl-long.69.pdf