LOCOST: State-Space Models for Long Document Abstractive Summarization

Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, Patrick Gallinari


Abstract
State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of 𝒪(L log L), this architecture can handle significantly longer sequences than state-of-the-art models that are based on sparse attention patterns. We evaluate our model on a series of long document abstractive summarization tasks. The model reaches a performance level that is 93-96% comparable to the top-performing sparse transformers of the same size while saving up to 50% memory during training and up to 87% during inference. Additionally, LOCOST effectively handles input texts exceeding 600K tokens at inference time, setting new state-of-the-art results on full-book summarization and opening new perspectives for long input processing.
Anthology ID:
2024.eacl-long.69
Original:
2024.eacl-long.69v1
Version 2:
2024.eacl-long.69v2
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1144–1159
Language:
URL:
https://preview.aclanthology.org/icon-24-ingestion/2024.eacl-long.69/
DOI:
Award:
Best Paper Award
Bibkey:
Cite (ACL):
Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, and Patrick Gallinari. 2024. LOCOST: State-Space Models for Long Document Abstractive Summarization. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1144–1159, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
LOCOST: State-Space Models for Long Document Abstractive Summarization (Le Bronnec et al., EACL 2024)
PDF:
https://preview.aclanthology.org/icon-24-ingestion/2024.eacl-long.69.pdf
Video:
https://preview.aclanthology.org/icon-24-ingestion/2024.eacl-long.69.mp4