Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models
Ahmad Dawar Hakimi, Ali Modarressi, Philipp Wicke, Hinrich Schuetze
Abstract
Understanding how large language models (LLMs) acquire and store factual knowledge is crucial for enhancing their interpretability, reliability, and efficiency. In this work, we analyze the evolution of factual knowledge representation in the OLMo-7B model by tracking the roles of its attention heads and feed-forward networks (FFNs) over training. We classify these components into four roles—general, entity, relation-answer, and fact-answer specific—and examine their stability and transitions. Our results show that LLMs initially depend on broad, general-purpose components, which later specialize as training progresses. Once the model reliably predicts answers, some components are repurposed, suggesting an adaptive learning process. Notably, answer-specific attention heads display the highest turnover, whereas FFNs remain stable, continually refining stored knowledge. These insights offer a mechanistic view of knowledge formation in LLMs and have implications for model pruning, optimization, and transparency.
- Anthology ID:
- 2025.findings-acl.654
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2025
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 12633–12653
- URL:
- https://preview.aclanthology.org/corrections-2025-08/2025.findings-acl.654/
- DOI:
- 10.18653/v1/2025.findings-acl.654
- Cite (ACL):
- Ahmad Dawar Hakimi, Ali Modarressi, Philipp Wicke, and Hinrich Schuetze. 2025. Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 12633–12653, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models (Hakimi et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/corrections-2025-08/2025.findings-acl.654.pdf