@inproceedings{sekii-2025-flashback,
title = "Flashback: Memory Mechanism for Enhancing Memory Efficiency and Speed in Deep Sequential Models",
author = "Sekii, Taiki",
editor = "Rambow, Owen and
Wanner, Leo and
Apidianaki, Marianna and
Al-Khalifa, Hend and
Di Eugenio, Barbara and
Schockaert, Steven",
booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
month = jan,
year = "2025",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/fix-sig-urls/2025.coling-main.575/",
pages = "8602--8611",
abstract = "In this study, we tackle three main challenges of deep sequential processing models in previous research: (1) memory degradation, (2) inaccurate gradient backpropagation, and (3) compatibility with next-token prediction. Specifically, to address (1-2), we define a Flashback property in which memory is preserved perfectly as an identity mapping of its stored value in a memory region until it is overwritten by a hidden state at a different time step. We propose a Flashback mechanism that satisfies this property in a fully differentiable, end-to-end manner. Further, to tackle (3), we propose architectures that incorporate the Flashback mechanism into Transformers and Mamba, enabling next-token prediction for language modeling tasks. In experiments, we trained on The Pile dataset, which includes diverse texts, to evaluate tradeoffs between commonsense reasoning accuracy, processing speed, and memory usage after introducing the Flashback mechanism into existing methods. The evaluations confirmed the effectiveness of the Flashback mechanism."
}
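The Flashback property described in the abstract, memory slots carried forward as an exact identity mapping until a write overwrites them with the current hidden state, can be illustrated with a minimal toy sketch. The snippet below is a hypothetical NumPy illustration of that property only; all names (`num_slots`, `write_gate`) are assumptions for exposition, and it is not the paper's actual mechanism, which is fully differentiable and integrated into Transformers and Mamba.

```python
import numpy as np

# Toy illustration (NOT the paper's implementation) of the Flashback
# property: each memory slot is an exact identity mapping of its stored
# value across time steps, until a write gate overwrites it with the
# current hidden state h_t.

rng = np.random.default_rng(0)
num_slots, dim, steps = 4, 8, 6
M = np.zeros((num_slots, dim))  # memory region, one row per slot

for t in range(steps):
    h_t = rng.standard_normal(dim)                 # current hidden state
    write_gate = (rng.random(num_slots) > 0.7)     # which slots to overwrite
    write_gate = write_gate.astype(float)          # hard 0/1 gate for clarity
    # Untouched slots pass through unchanged (identity); gated slots
    # are replaced by h_t.
    M = (1.0 - write_gate)[:, None] * M + write_gate[:, None] * h_t

print(M.shape)  # (4, 8): slots persist verbatim until overwritten
```

In the paper the update is end-to-end differentiable rather than a hard binary gate as sketched here.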