Qilei Zhang


2025

PEPE: Long-context Extension for Large Language Models via Periodic Extrapolation Positional Encodings
Jikun Hu | Dongsheng Guo | Yuli Liu | Qingyao Ai | Lixuan Wang | Xuebing Sun | Qilei Zhang | Quan Zhou | Cheng Luo
Findings of the Association for Computational Linguistics: EMNLP 2025

Long-context extension seeks to expand the contextual window of pre-trained large language models (LLMs), allowing them to handle contexts several times longer than their original training lengths. The primary method for extending the window involves expanding the initial positional encodings, such as interpolating and extrapolating new positions based on Rotary Position Embedding (RoPE). This expansion inevitably disrupts the positional encodings learned during pre-training, thereby affecting attention allocation and introducing unseen positional encoding distributions. To address this issue, we propose a new extension strategy based on RoPE, namely Periodic Extrapolation Positional Encodings (PEPE). This strategy expands the pre-trained high-dimensional components of positional encodings by replicating them in a periodic manner, thereby neither altering the learned positional encoding spaces nor introducing new positional encoding distributions. Experiments demonstrate that PEPE-based approaches can significantly improve long-context extension capabilities using just one-fourth of the fine-tuning steps required by state-of-the-art methods. In addition, we analyze the characteristics of PEPE-based methods and the key parameters that contribute to their effectiveness. The code is publicly available.
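
To make the idea concrete, here is a minimal sketch of how periodic replication of RoPE components could look, based only on the abstract's description. The function names, the `train_len` and `split_idx` parameters, and the exact rule for which components are wrapped are assumptions for illustration; the paper defines the actual method.

```python
import torch

def rope_angles(positions, dim, base=10000.0):
    """Standard RoPE angles: theta_i = base^(-2i/dim) per pair of dimensions."""
    inv_freq = base ** (-torch.arange(0, dim, 2).float() / dim)
    return positions[:, None].float() * inv_freq[None, :]   # (seq_len, dim/2)

def pepe_angles(positions, dim, train_len, split_idx, base=10000.0):
    """Hypothetical sketch of periodic extrapolation:
    low-index (high-frequency) components keep their ordinary RoPE angles,
    while high-index components wrap positions modulo the pre-trained context
    length, so their angle distribution is replicated periodically and never
    leaves the space seen during pre-training."""
    inv_freq = base ** (-torch.arange(0, dim, 2).float() / dim)
    pos = positions[:, None].float()                     # raw positions
    wrapped = (positions % train_len)[:, None].float()   # periodic replication
    angles = pos * inv_freq[None, :]
    angles[:, split_idx:] = wrapped * inv_freq[None, split_idx:]
    return angles

# Example: extend a 2048-token pre-training window to 8192 positions.
pos = torch.arange(8192)
ang = pepe_angles(pos, dim=128, train_len=2048, split_idx=16)
```

The design intuition, as described in the abstract, is that wrapping the replicated components keeps every angle inside the distribution the model saw during pre-training, instead of stretching or extrapolating it.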