Computation Mechanism Behind LLM Position Generalization

Chi Han; Heng Ji


Abstract

Most written natural languages are composed of sequences of words and sentences. Similar to humans, large language models (LLMs) exhibit flexibility in handling textual positions, a phenomenon we term Position Generalization. They can understand texts with position perturbations and, with the latest techniques, generalize to texts longer than those encountered during training. These phenomena suggest that LLMs handle positions in a tolerant manner, but how LLMs computationally process positional relevance remains largely unexplored. In this work, we show how LLMs enforce certain computational mechanisms that allow for this tolerance to position perturbations. Despite the complex design of the self-attention mechanism, we find that LLMs learn a counterintuitive disentanglement of attention logits: their values show a 0.959 linear correlation with an approximation of the arithmetic sum of positional relevance and semantic importance. Furthermore, we identify a prevalent pattern in intermediate features that enables this effect, suggesting that it is a learned behavior rather than a natural result of the model architecture. Based on these findings, we provide computational explanations and criteria for the position flexibilities observed in LLMs.
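The additive claim in the abstract can be illustrated with a toy example. The sketch below is not the authors' procedure: the positional term `pos`, the semantic term `sem`, the noise level, and the sequence length are all hypothetical values chosen for illustration. It builds synthetic causal attention logits as a sum of a distance-dependent term and a key-dependent term plus a small residual, then measures the Pearson linear correlation between the logits and the purely additive approximation.

```python
import math
import random


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


random.seed(0)
n = 32  # toy sequence length (assumption)

# Hypothetical components, not taken from the paper:
pos = [-0.1 * d for d in range(n)]                # positional relevance by distance i - j
sem = [random.gauss(0.0, 1.0) for _ in range(n)]  # semantic importance of key j

logits, additive = [], []
for i in range(n):
    for j in range(i + 1):  # causal attention: key j <= query i
        s = pos[i - j] + sem[j]
        additive.append(s)
        # Toy logit: the additive part plus a non-additive residual.
        logits.append(s + random.gauss(0.0, 0.3))

r = pearson(logits, additive)
print(f"linear correlation: {r:.3f}")
```

In a real model, the logits would come from the attention layers and the two terms would have to be estimated rather than assumed; the sketch only shows what a high linear correlation with an additive approximation means.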

Anthology ID: 2025.acl-long.953
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 19408–19424
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.953/
Cite (ACL): Chi Han and Heng Ji. 2025. Computation Mechanism Behind LLM Position Generalization. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 19408–19424, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Computation Mechanism Behind LLM Position Generalization (Han & Ji, ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.953.pdf
