The Knowledge Microscope: Features as Better Analytical Lenses than Neurons

Yuheng Chen; Pengfei Cao; Kang Liu (刘康); Jun Zhao (军 赵)

Abstract

We demonstrate that features, rather than neurons, serve as superior analytical units for understanding the mechanisms of factual knowledge in Language Models (LMs). Previous studies primarily use MLP neurons as units of analysis; however, neurons suffer from polysemanticity, leading to limited knowledge expression and poor interpretability. We first conduct preliminary experiments to validate that sparse autoencoders (SAEs) can effectively decompose neurons into features. With this established, our core findings reveal three key advantages of features over neurons: (1) Features exhibit stronger influence on knowledge expression and superior interpretability. (2) Features demonstrate enhanced monosemanticity, showing distinct activation patterns between related and unrelated facts. (3) The feature-based method outperforms neuron-based approaches in erasing privacy-sensitive information from LMs. Additionally, we propose FeatureEdit, the first feature-based editing method. Code and dataset will be made available.

Anthology ID: 2025.acl-long.516
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 10493–10515
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.516/
Cite (ACL): Yuheng Chen, Pengfei Cao, Kang Liu, and Jun Zhao. 2025. The Knowledge Microscope: Features as Better Analytical Lenses than Neurons. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10493–10515, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): The Knowledge Microscope: Features as Better Analytical Lenses than Neurons (Chen et al., ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.516.pdf
