@inproceedings{li-etal-2025-training-free,
    title = "A Training-Free Length Extrapolation Approach for {LLM}s: Greedy Attention Logit Interpolation",
    author = "Li, Yan  and
      Zhang, Tianyi  and
      Li, Zechuan  and
      Han, Caren",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.443/",
    pages = "8784--8804",
    ISBN = "979-8-89176-332-6",
    abstract = "Transformer-based Large Language Models (LLMs) struggle with inputs exceeding their training context window due to positional out-of-distribution (O.O.D.) issues that disrupt attention. Existing solutions, including fine-tuning and training-free methods, face challenges like inefficiency, redundant interpolation, logit outliers, or loss of local positional information. We propose Greedy Attention Logit Interpolation (GALI), a training-free method that improves length extrapolation by greedily reusing pretrained positional intervals and interpolating attention logits to eliminate outliers. GALI achieves stable and superior performance across a wide range of long-context tasks without requiring input-length-specific tuning. Our analysis further reveals that LLMs interpret positional intervals unevenly and that restricting interpolation to narrower ranges improves performance, even on short-context tasks. GALI represents a step toward more robust and generalizable long-text processing in LLMs."
}

Markdown (Informal)
[A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation](https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.443/) (Li et al., EMNLP 2025)
ACL

Yan Li, Tianyi Zhang, Zechuan Li, and Caren Han. 2025. [A Training-Free Length Extrapolation Approach for LLMs: Greedy Attention Logit Interpolation](https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.443/). In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 8784–8804, Suzhou, China. Association for Computational Linguistics.