Neuron Empirical Gradient: Discovering and Quantifying Neurons’ Global Linear Controllability

Xin Zhao, Zehui Jiang, Naoki Yoshinaga


Abstract
While feed-forward neurons in pre-trained language models (PLMs) can encode knowledge, past research targeted a small subset of neurons that heavily influence outputs. This leaves the broader role of neuron activations unclear, limiting progress in areas like knowledge editing. We uncover a global linear relationship between neuron activations and outputs using neuron interventions on a knowledge probing dataset. The gradient of this linear relationship, which we call the **neuron empirical gradient (NEG)**, captures how changes in activations affect predictions. To compute NEG efficiently, we propose **NeurGrad**, enabling large-scale analysis of neuron behavior in PLMs. We also show that NEG effectively captures language skills across diverse prompts through skill neuron probing. Experiments on **MCEval8k**, a multi-genre multiple-choice knowledge benchmark, support NEG’s ability to represent model knowledge. Further analysis highlights the key properties of NEG-based skill representation: efficiency, robustness, flexibility, and interdependency. Code and data are released.
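As a rough illustration of the idea in the abstract, the slope of a linear relationship between activation shifts and output changes can be estimated by intervening on one activation and fitting a least-squares line. The sketch below is not the authors' NeurGrad method and does not use a real PLM: `toy_model_prob`, its coefficients, and the chosen shifts are hypothetical stand-ins introduced only to show the fitting step.

```python
import math


def toy_model_prob(activation: float) -> float:
    """Hypothetical stand-in for a PLM: maps one neuron's activation to the
    probability of the correct token. The coefficients are made up."""
    logit = 0.4 * activation - 1.0
    return 1.0 / (1.0 + math.exp(-logit))


def empirical_gradient(model, base_activation, shifts):
    """Fit a least-squares line to (activation shift, probability change) pairs.

    Returns the slope (an empirical gradient in the sense sketched here) and
    the Pearson correlation, which indicates how linear the relationship is.
    """
    base_prob = model(base_activation)
    # Intervene: shift the activation and record the change in output.
    deltas = [model(base_activation + s) - base_prob for s in shifts]

    mean_s = sum(shifts) / len(shifts)
    mean_d = sum(deltas) / len(deltas)
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(shifts, deltas))
    var_s = sum((s - mean_s) ** 2 for s in shifts)
    var_d = sum((d - mean_d) ** 2 for d in deltas)

    slope = cov / var_s
    pearson_r = cov / math.sqrt(var_s * var_d)
    return slope, pearson_r


# Assumed intervention sizes; a real study would choose these empirically.
slope, r = empirical_gradient(toy_model_prob, 0.0, [-0.5, -0.25, 0.25, 0.5])
print(f"empirical gradient: {slope:.4f}, Pearson r: {r:.4f}")
```

A correlation close to 1 in this toy setting means the output change is well approximated by a line in the activation shift, which is the kind of relationship the abstract describes at scale.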
Anthology ID:
2025.acl-long.1041
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
21446–21477
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1041/
Cite (ACL):
Xin Zhao, Zehui Jiang, and Naoki Yoshinaga. 2025. Neuron Empirical Gradient: Discovering and Quantifying Neurons’ Global Linear Controllability. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 21446–21477, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Neuron Empirical Gradient: Discovering and Quantifying Neurons’ Global Linear Controllability (Zhao et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1041.pdf