Text Compression for Efficient Language Generation

David Gu, Peter Belcak, Roger Wattenhofer


Abstract
We challenge the prevailing assumption that LLMs must rely fully on sub-word tokens for high-quality text generation. To this end, we propose the “Generative Pretrained Thoughtformer” (GPTHF), a hierarchical transformer language model capable of text generation by compressing text into sentence embeddings and employing a sentence attention mechanism. GPTHF retains GPT’s architecture, modifying only token interactions via dynamic sparse attention masks. Our experiments show that GPTHF achieves up to an order-of-magnitude improvement in FLOPs efficiency and a threefold increase in runtime speed compared to equally sized GPT models in the low-size regime. This is achieved through a unique generation method that caches and reuses sentence embeddings, allowing significant portions of the input to bypass large parts of the network.
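As a rough illustration of the mechanism the abstract describes (not the authors' code, and assuming a PyTorch setting with hypothetical names), the sketch below builds a dynamic sparse attention mask in which each token attends causally only to tokens of its own sentence, which is the kind of restriction on token interactions that enables sentence-level compression:

```python
# Illustrative sketch only; the actual GPTHF masking scheme may differ.
import torch

def sentence_block_mask(sentence_ids: torch.Tensor) -> torch.Tensor:
    """sentence_ids: (seq_len,) tensor assigning each token a sentence index.

    Returns a (seq_len, seq_len) boolean mask that is True where attention is
    allowed: within the same sentence and only to current or earlier positions
    (causal).
    """
    seq_len = sentence_ids.size(0)
    same_sentence = sentence_ids.unsqueeze(0) == sentence_ids.unsqueeze(1)
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    return same_sentence & causal

# Example: three tokens in sentence 0 followed by two tokens in sentence 1.
mask = sentence_block_mask(torch.tensor([0, 0, 0, 1, 1]))
print(mask)
```

Under this kind of masking, a finished sentence's embedding can be cached and reused across generation steps, since later tokens never attend back into its token-level representations.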
Anthology ID:
2025.naacl-srw.18
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
Month:
April
Year:
2025
Address:
Albuquerque, USA
Editors:
Abteen Ebrahimi, Samar Haider, Emmy Liu, Sammar Haider, Maria Leonor Pacheco, Shira Wein
Venues:
NAACL | WS
Publisher:
Association for Computational Linguistics
Pages:
186–192
URL:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.naacl-srw.18/
Cite (ACL):
David Gu, Peter Belcak, and Roger Wattenhofer. 2025. Text Compression for Efficient Language Generation. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop), pages 186–192, Albuquerque, USA. Association for Computational Linguistics.
Cite (Informal):
Text Compression for Efficient Language Generation (Gu et al., NAACL 2025)
PDF:
https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.naacl-srw.18.pdf