Probing and Boosting Large Language Models Capabilities via Attention Heads

Dezhi Zhao, Xin Liu, Xiaocheng Feng, Hui Wang, Bing Qin


Abstract
Understanding the internal origins of capabilities in large language models (LLMs) is crucial for interpretability and efficient adaptation. However, the emergence of specific capabilities remains poorly understood, as most existing approaches rely on external signals (e.g., performance shifts or gradient similarities) with limited structural grounding. To address these issues, this paper proposes a lightweight and highly interpretable approach that links LLM capabilities to internal components by identifying correspondences at the level of attention heads. Specifically, we first define five fundamental capabilities, namely Mathematical Reasoning, Reading Comprehension, Commonsense Reasoning, Scientific Reasoning, and Professional Expertise, and employ probing techniques to detect the attention heads most predictive of each, thereby establishing capability–head mappings. For targeted instruction tuning, complex tasks are decomposed into these fundamental capabilities, and training data are selected accordingly. Experiments on LLaMA3.1-8B and Qwen2.5-7B show over 70% discrimination accuracy in identifying capabilities. On MMLU and BBH, our method improves accuracy by 1 to 1.5 points over the gradient-based method LESS and by 5 to 6 points over other intermediate-state baselines.
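The abstract describes training lightweight probes over attention-head outputs to find the heads most predictive of each capability. As a rough illustration only (not the authors' released code), the sketch below scores each head with a cross-validated linear probe; the array shapes, the random placeholder features, and the sklearn-based scoring are assumptions for demonstration, whereas in the paper the features would be attention-head activations collected from LLaMA3.1-8B or Qwen2.5-7B on capability-labeled examples.

```python
# Minimal sketch of per-head linear probing (illustrative assumptions only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical dimensions: a small stand-in for a real model's attention heads.
n_examples, n_layers, n_heads, d_head = 200, 4, 8, 16

# Placeholder features: in practice these would be pooled outputs of each
# attention head for every probe example; random values stand in here.
head_feats = rng.normal(size=(n_examples, n_layers, n_heads, d_head))

# Binary labels: 1 if the example exercises the target capability
# (e.g., mathematical reasoning), 0 otherwise.
labels = rng.integers(0, 2, size=n_examples)

# Probe every head independently; the cross-validated accuracy of a linear
# classifier serves as that head's predictiveness score for the capability.
scores = np.zeros((n_layers, n_heads))
for layer in range(n_layers):
    for head in range(n_heads):
        X = head_feats[:, layer, head, :]
        probe = LogisticRegression(max_iter=1000)
        scores[layer, head] = cross_val_score(probe, X, labels, cv=5).mean()

# The highest-scoring heads form the capability-to-head mapping.
ranked = np.column_stack(np.unravel_index(np.argsort(scores, axis=None)[::-1],
                                          scores.shape))
print("Top (layer, head) pairs:", [tuple(r) for r in ranked[:5]])
```

Per the abstract, a probe of this kind would be fitted once per capability, and the resulting head rankings would then guide the selection of instruction-tuning data for the capabilities a target task decomposes into.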
Anthology ID:
2025.emnlp-main.1450
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
28518–28532
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1450/
Cite (ACL):
Dezhi Zhao, Xin Liu, Xiaocheng Feng, Hui Wang, and Bing Qin. 2025. Probing and Boosting Large Language Models Capabilities via Attention Heads. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 28518–28532, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Probing and Boosting Large Language Models Capabilities via Attention Heads (Zhao et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1450.pdf
Checklist:
2025.emnlp-main.1450.checklist.pdf