Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models

Youngji Roh, Hyunjin Cho, Jaehyung Kim


Abstract
Large Language Models (LLMs) exhibit highly anisotropic internal representations, often characterized by massive activations, a phenomenon in which a small subset of feature dimensions has magnitudes significantly larger than the rest. While prior work views these extreme dimensions primarily as artifacts to be managed, we propose a distinct perspective: these dimensions serve as intrinsic, interpretable functional units arising from domain specialization. Specifically, we propose a simple magnitude-based criterion to identify Domain-Critical Dimensions in a training-free manner. Our analyses reveal that such dimensions behave as interpretable semantic detectors for symbolic/quantitative patterns or domain-specific terms. In addition, we introduce Critical Dimension Steering, which applies activation steering exclusively to the identified dimensions. Empirical results show that this approach outperforms conventional whole-dimension steering in domain adaptation and jailbreaking scenarios.
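The two ideas named in the abstract, magnitude-based selection and steering restricted to the selected dimensions, can be illustrated with a toy sketch. This is not the authors' implementation: the threshold rule (mean absolute activation above `ratio` times the median across dimensions), the function names `find_critical_dims` and `steer`, and the parameters `ratio` and `alpha` are all assumptions chosen for illustration, and the activations are made-up numbers rather than real model outputs.

```python
from statistics import mean, median


def find_critical_dims(domain_acts, ratio=5.0):
    """Pick dimensions with unusually large average magnitude on domain text.

    ASSUMPTION: the cutoff (ratio * median magnitude) is a stand-in rule,
    not the criterion defined in the paper.
    """
    n_dims = len(domain_acts[0])
    # Mean absolute activation per dimension over all domain tokens.
    mags = [mean(abs(row[d]) for row in domain_acts) for d in range(n_dims)]
    cutoff = ratio * median(mags)
    return [d for d, m in enumerate(mags) if m > cutoff]


def steer(hidden, direction, dims, alpha=1.0):
    """Add alpha * direction to hidden, but only on the selected dimensions."""
    selected = set(dims)
    return [
        h + alpha * v if i in selected else h
        for i, (h, v) in enumerate(zip(hidden, direction))
    ]


# Toy activations: 3 tokens x 4 dimensions, with dimension 2 "massive".
acts = [
    [0.1, -0.2, 40.0, 0.3],
    [0.2, 0.1, 38.0, -0.1],
    [-0.1, 0.3, 42.0, 0.2],
]
dims = find_critical_dims(acts)

# Hypothetical steering vector; only the selected dimension is modified.
steered = steer([1.0, 1.0, 1.0, 1.0], [0.5, 0.5, 0.5, 0.5], dims, alpha=2.0)
print(dims, steered)
```

Whole-dimension steering would correspond to passing every index as `dims`; restricting the update to the selected indices leaves all other coordinates of the hidden state untouched.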
Anthology ID:
2026.acl-long.1380
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
29930–29956
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1380/
Cite (ACL):
Youngji Roh, Hyunjin Cho, and Jaehyung Kim. 2026. Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 29930–29956, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Embracing Anisotropy: Turning Massive Activations into Interpretable Control Knobs for Large Language Models (Roh et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.1380.pdf
Checklist:
 2026.acl-long.1380.checklist.pdf