Chih-Hao Hsu
2026
Role-Sensitive Neurons: A Neuron-Level Gain Control Mechanism for Confidence Steering
Peiwen Huang | Chih-Hao Hsu | Tzu-Hung Huang | Shou-De Lin
Findings of the Association for Computational Linguistics: ACL 2026
Role-playing prompts effectively steer Large Language Models (LLMs), yet the neural mechanism driving this behavioral shift remains unclear. In this work, we identify Role-Sensitive Neurons (RSNs)—a sparse sub-network (≈ 0.5% of all neurons) governing the transition from hesitation to action. Using a novel evaluation framework with explicit abstention (MMLU-E), we reveal a Confidence-Performance Decoupling: roles primarily modulate the model’s probabilistic "willingness to act" rather than its underlying knowledge representation. We demonstrate that RSNs function as a mechanistic gain control system: causal intervention on this subspace allows precise regulation of abstention behavior. Furthermore, cross-model transfer experiments confirm that these circuits are indigenous to pre-training, with Instruction Tuning (SFT) acting merely as a "signal sharpener" to refine latent gain dynamics. Finally, we identify a critical safety boundary: in knowledge-deficient models, amplifying RSNs induces "unwarranted certainty," highlighting decisiveness as a tunable gain parameter distinct from epistemic truth.
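The "gain control" intervention described in the abstract can be pictured as scaling the activations of a small set of neurons while leaving the rest untouched. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function name `apply_gain`, the toy activation vector, and the neuron indices are all hypothetical, and the abstract does not specify how RSNs are located or at which layer the intervention is applied.

```python
from typing import Iterable, List


def apply_gain(activations: List[float], neuron_ids: Iterable[int], gain: float) -> List[float]:
    """Return a copy of `activations` with the selected neurons scaled by `gain`.

    Hypothetical helper: a generic multiplicative intervention on a neuron subset,
    assumed here as a stand-in for the gain control the abstract describes.
    """
    selected = set(neuron_ids)
    return [a * gain if i in selected else a for i, a in enumerate(activations)]


# Toy hidden state with 8 neurons; indices 2 and 5 stand in for hypothetical RSNs.
hidden = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]
rsn_ids = [2, 5]

amplified = apply_gain(hidden, rsn_ids, gain=2.0)   # gain > 1 amplifies the subset
suppressed = apply_gain(hidden, rsn_ids, gain=0.0)  # gain = 0 ablates the subset
print(amplified)
print(suppressed)
```

In a real model, the same scaling would typically be applied to intermediate activations during the forward pass; which layers and neurons to target is exactly what the paper's identification procedure would have to supply.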