The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction

Yihuai Hong; Meng Cao; Dian Zhou; Lei Yu; Zhijing Jin

Abstract

Large language models (LLMs) excel on a variety of reasoning benchmarks, but previous studies suggest they sometimes struggle to generalize to unseen questions, potentially due to over-reliance on memorized training examples. However, the precise conditions under which LLMs switch between reasoning and memorization during text generation remain unclear. In this work, we provide a mechanistic understanding of LLMs’ reasoning-memorization dynamics by identifying a set of linear features in the model’s residual stream that govern the balance between genuine reasoning and memory recall. These features not only distinguish reasoning tasks from memory-intensive ones but can also be manipulated to causally influence model performance on reasoning tasks. Additionally, we show that intervening in these reasoning features helps the model more accurately activate the most relevant problem-solving capabilities during answer generation. Our findings offer new insights into the underlying mechanisms of reasoning and memory in LLMs and pave the way for the development of more robust and interpretable generative AI systems. Our code and data are at https://github.com/yihuaihong/Linear_Reasoning_Memory_Features.
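To make the idea of a linear feature in the residual stream concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes, purely for illustration, that the direction is estimated as a difference of mean activations between two prompt sets, and it uses synthetic vectors in place of real model activations. All names (`mean_difference_direction`, `project`, `steer`, `reasoning_acts`, `memory_acts`) are hypothetical.

```python
import numpy as np


def mean_difference_direction(acts_a, acts_b):
    """Unit vector pointing from the mean of acts_b to the mean of acts_a."""
    diff = acts_a.mean(axis=0) - acts_b.mean(axis=0)
    return diff / np.linalg.norm(diff)


def project(acts, direction):
    """Scalar projection of each activation vector onto the direction."""
    return acts @ direction


def steer(acts, direction, alpha):
    """Shift every activation by alpha along the direction (an intervention)."""
    return acts + alpha * direction


# Synthetic stand-ins for residual-stream activations (assumption: real
# activations would be collected from a chosen layer of an actual model).
rng = np.random.default_rng(0)
d_model = 16
true_dir = np.zeros(d_model)
true_dir[0] = 1.0
reasoning_acts = rng.normal(size=(200, d_model)) + 3.0 * true_dir
memory_acts = rng.normal(size=(200, d_model)) - 3.0 * true_dir

direction = mean_difference_direction(reasoning_acts, memory_acts)

# Use the midpoint of the two mean projections as a simple decision threshold.
threshold = 0.5 * (
    project(reasoning_acts, direction).mean()
    + project(memory_acts, direction).mean()
)
predicted_reasoning = np.concatenate(
    [project(reasoning_acts, direction), project(memory_acts, direction)]
) > threshold
labels = np.concatenate([np.ones(200, dtype=bool), np.zeros(200, dtype=bool)])
accuracy = (predicted_reasoning == labels).mean()

# Push the "memory" activations toward the "reasoning" side of the direction.
steered = steer(memory_acts, direction, alpha=6.0)
```

Because `direction` has unit norm, `steer` raises each projection by exactly `alpha`. In this toy setting that is the sense in which a single direction can both separate two kinds of inputs and be manipulated to shift them.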

Anthology ID: 2025.findings-acl.1111
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 21565–21585
URL: https://preview.aclanthology.org/landing_page/2025.findings-acl.1111/
Cite (ACL): Yihuai Hong, Meng Cao, Dian Zhou, Lei Yu, and Zhijing Jin. 2025. The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction. In Findings of the Association for Computational Linguistics: ACL 2025, pages 21565–21585, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction (Hong et al., Findings 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.findings-acl.1111.pdf
