Elliot Nelson
2025
EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts
Subhajit Chaudhury | Payel Das | Sarathkrishna Swaminathan | Georgios Kollias | Elliot Nelson | Khushbu Pahwa | Tejaswini Pedapati | Igor Melnyk | Matthew Riemer
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent advances in Large Language Models (LLMs) have yielded impressive successes on many language tasks. However, efficient processing of long contexts using LLMs remains a significant challenge. We introduce **EpMAN** – a method for processing long contexts in an episodic memory module while holistically attending to semantically relevant context chunks. Output from episodic attention is then used to reweight the decoder’s self-attention to the stored KV cache of the context during training and generation. When an LLM decoder is trained using **EpMAN**, its performance on multiple challenging single-hop long-context recall and question-answering benchmarks is stronger and more robust across the range from 16k to 256k tokens than that of baseline decoders trained with self-attention and of popular retrieval-augmented generation frameworks.
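The abstract describes a chunk-level reweighting of decoder self-attention by episodic-memory relevance scores. The NumPy sketch below illustrates one plausible form of that idea: per-chunk relevance scores rescale attention weights over a cached context before the weighted sum of values. The function name, the renormalization step, and the toy relevance scores are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def episodic_reweighted_attention(query, k_cache, v_cache, chunk_ids, chunk_relevance):
    """Single-head attention over a cached context whose tokens are grouped into chunks.
    Attention weights are rescaled by per-chunk relevance scores (hypothetical sketch)."""
    d = query.shape[-1]
    scores = query @ k_cache.T / np.sqrt(d)           # (num_queries, cache_len)
    attn = softmax(scores, axis=-1)
    # Broadcast each cached token's chunk relevance onto its attention column, then renormalize.
    token_relevance = chunk_relevance[chunk_ids]      # (cache_len,)
    reweighted = attn * token_relevance[None, :]
    reweighted /= reweighted.sum(axis=-1, keepdims=True)
    return reweighted @ v_cache                       # (num_queries, d)

# Toy usage: two chunks of four cached tokens each, one query vector.
rng = np.random.default_rng(0)
d_model, cache_len = 8, 8
k_cache = rng.normal(size=(cache_len, d_model))
v_cache = rng.normal(size=(cache_len, d_model))
query = rng.normal(size=(1, d_model))
chunk_ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])        # token -> chunk index
chunk_relevance = np.array([0.9, 0.1])                # e.g. scores from episodic retrieval
out = episodic_reweighted_attention(query, k_cache, v_cache, chunk_ids, chunk_relevance)
print(out.shape)  # (1, 8)
```

In this sketch the relevance scores simply down-weight tokens from less relevant chunks; how EpMAN obtains and applies such scores during training and generation is specified in the paper itself.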