Yuheng Chen


2025

Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models
Yuheng Chen | Pengfei Cao | Yubo Chen | Yining Wang | Shengping Liu | Kang Liu | Jun Zhao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Knowledge neuron theory, which suggests that facts are stored within multi-layer perceptron (MLP) neurons, provides a key approach to understanding the mechanisms of factual knowledge in Large Language Models (LLMs). This paper further explores **Degenerate Knowledge Neurons** (DKNs), where distinct sets of neurons can store identical facts; unlike simple redundancy, however, these sets also participate in storing other, different facts. Despite the novelty and unique properties of this concept, it has not been rigorously defined or systematically studied. Our contributions are: (1) We pioneer the study of structures in knowledge neurons by analyzing weight connection patterns, providing a comprehensive definition of DKNs from both functional and structural aspects. (2) Based on this definition, we develop the **Neuronal Topology Clustering** method, which identifies DKNs more accurately. (3) We demonstrate the practical applications of DKNs in two aspects: guiding LLMs to learn new knowledge, and relating to LLMs’ robustness against input errors.
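As a rough illustration of the pipeline this abstract describes, the sketch below pairs the integrated-gradients attribution commonly used in knowledge-neuron work with a simple similarity measure over output weights. It is a minimal stand-in under stated assumptions, not the paper's Neuronal Topology Clustering procedure; the toy model, threshold, and function names are all illustrative.

```python
# Minimal sketch (assumptions throughout): integrated-gradients
# attribution over one layer's MLP activations, plus a weight-based
# similarity used as a crude proxy for "weight connection patterns".
import torch

def neuron_attribution(logit_fn, activations, steps=20):
    """Attribute a target logit to MLP neurons via integrated
    gradients along the path from the zero vector to `activations`."""
    total = torch.zeros_like(activations)
    for k in range(1, steps + 1):
        scaled = (k / steps) * activations
        scaled.requires_grad_(True)
        logit_fn(scaled).backward()
        total += scaled.grad
    return activations * total / steps  # Riemann-sum approximation

def structural_similarity(W_out, i, j):
    """Cosine similarity of two neurons' output weight columns."""
    return torch.nn.functional.cosine_similarity(
        W_out[:, i], W_out[:, j], dim=0).item()

# Toy demo: an 8-neuron "MLP" feeding a 10-way readout.
torch.manual_seed(0)
W_out = torch.randn(10, 8)
acts = torch.rand(8)
scores = neuron_attribution(lambda a: (W_out @ a)[3], acts)
top = scores.topk(3).indices.tolist()
# Group top-attributed neurons whose output weights point the same
# way; the 0.5 threshold is arbitrary.
pairs = [(i, j) for i in top for j in top
         if i < j and structural_similarity(W_out, i, j) > 0.5]
```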

The Knowledge Microscope: Features as Better Analytical Lenses than Neurons
Yuheng Chen | Pengfei Cao | Kang Liu | Jun Zhao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We demonstrate that features, rather than neurons, serve as superior analytical units for understanding the mechanisms of factual knowledge in Language Models (LMs). Previous studies primarily use MLP neurons as units of analysis; however, neurons suffer from polysemanticity, leading to limited knowledge expression and poor interpretability. We first conduct preliminary experiments to validate that sparse autoencoders (SAEs) can effectively decompose neurons into features. With this established, our core findings reveal three key advantages of features over neurons: (1) Features exert a stronger influence on knowledge expression and offer superior interpretability. (2) Features demonstrate enhanced monosemanticity, showing distinct activation patterns between related and unrelated facts. (3) Feature-based methods outperform neuron-based approaches in erasing privacy-sensitive information from LMs. Additionally, we propose FeatureEdit, the first feature-based editing method. Code and dataset will be available.
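For readers unfamiliar with the SAE setup this abstract leans on, here is a minimal sketch of the common ReLU sparse autoencoder from the interpretability literature, trained to reconstruct MLP activations under an L1 sparsity penalty. The dimensions, coefficient, and random stand-in activations are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of an SAE that decomposes MLP activations into
# sparse, (ideally) monosemantic features; all sizes are placeholders.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model, d_features, l1_coeff=1e-3):
        super().__init__()
        self.enc = nn.Linear(d_model, d_features)  # overcomplete: d_features >> d_model
        self.dec = nn.Linear(d_features, d_model)
        self.l1_coeff = l1_coeff

    def forward(self, x):
        f = torch.relu(self.enc(x))                # sparse feature activations
        x_hat = self.dec(f)
        loss = ((x_hat - x) ** 2).mean() + self.l1_coeff * f.abs().mean()
        return f, x_hat, loss

# Training loop over recorded MLP activations (random stand-ins here).
sae = SparseAutoencoder(d_model=64, d_features=512)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
for _ in range(100):
    acts = torch.randn(32, 64)                    # would be real LM activations
    _, _, loss = sae(acts)
    opt.zero_grad()
    loss.backward()
    opt.step()
```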

2024

Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons
Yifei Wang | Yuheng Chen | Wanting Wen | Yu Sheng | Linjing Li | Daniel Dajun Zeng
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

In this paper, we investigate whether Large Language Models (LLMs) actively recall or retrieve their internal repositories of factual knowledge when faced with reasoning tasks. By analyzing LLMs’ internal factual recall at each reasoning step via Knowledge Neurons, we reveal that LLMs fail to harness the critical factual associations under certain circumstances and instead opt for alternative, shortcut-like pathways to answer reasoning questions. By manually manipulating the recall of parametric knowledge in LLMs, we demonstrate that enhancing this recall process directly improves reasoning performance, whereas suppressing it leads to notable degradation. We then assess the effect of Chain-of-Thought (CoT) prompting, a powerful technique for addressing complex reasoning tasks; our findings indicate that CoT can intensify the recall of factual knowledge by encouraging LLMs to engage in orderly and reliable reasoning. Finally, we explore how contextual conflicts affect the retrieval of facts during reasoning, to gain a comprehensive understanding of the factual recall behaviors of LLMs. Code and data will be available soon.
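Manipulation experiments of the kind this abstract mentions are typically implemented with forward hooks that rescale selected neuron activations during generation. The sketch below shows that generic pattern; the layer path, neuron indices, and scaling factor are placeholders, not the authors' actual setup.

```python
# Sketch of the intervention style described above: scale the
# activations of previously identified knowledge neurons during the
# forward pass (factor > 1 to enhance recall, 0 to suppress it).
import torch

def make_scaling_hook(neuron_idx, factor):
    def hook(module, inputs, output):
        output = output.clone()
        output[..., neuron_idx] *= factor  # edit only the chosen neurons
        return output                       # returned tensor replaces the output
    return hook

# Hypothetical usage with a GPT-2-style HuggingFace model:
# layer = model.transformer.h[10].mlp.act   # post-activation module (assumed path)
# handle = layer.register_forward_hook(make_scaling_hook([42, 1337], 2.0))
# ... run the reasoning prompt and compare answer logits ...
# handle.remove()
```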