Efficient Transformer Parameter Reuse via Zero-Token Mechanism

Guanghao Li, Wenhao Jiang, Li Shen, Ming Tang, Chun Yuan


Abstract
Resource constraints often limit the parameter capacity of Large Language Models (LLMs), thereby hindering their performance. Although existing approaches leverage parameter sharing to reuse a fixed set of parameters within constrained budgets, they typically require each layer to fulfill multiple roles over a fixed number of iterations. This design compromises both efficiency and adaptability. In this work, we propose the **Zero Token Transformer (ZTT)**, which employs a head-tail decoupled parameter cycling strategy. Specifically, we decouple the first (head) and last (tail) layers from the parameter cycling process, enabling iterative refinement solely within the intermediate layers. Furthermore, we introduce a Zero-Token Mechanism, wherein a virtual token with a trainable key and a zero-valued value vector participates in attention like a standard token. The resulting attention scores not only reflect the computational significance of each layer but also facilitate dynamic early exiting, thereby preserving overall model accuracy. Our approach achieves superior performance under strict parameter constraints, substantially reduces computational overhead via early exits, and can be seamlessly integrated into the fine-tuning of existing pre-trained models, improving both efficiency and adaptability.
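The mechanism described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes single-head attention with no causal mask, it uses a fixed `zero_key` array in place of the trainable key, and the exit rule (stop once the mean attention weight on the zero token exceeds `exit_threshold`) is a hypothetical choice made only for illustration. The function and parameter names are likewise assumptions.

```python
import numpy as np

def attention_with_zero_token(q, k, v, zero_key):
    """Single-head attention with one extra virtual token prepended.

    q, k, v: arrays of shape (seq_len, d).
    zero_key: array of shape (d,), standing in for the trainable key.
    Returns (output, zero_score); zero_score[i] is the attention weight
    that query i places on the virtual token.
    """
    d = q.shape[-1]
    keys = np.vstack([zero_key[None, :], k])              # (seq_len + 1, d)
    values = np.vstack([np.zeros((1, v.shape[-1])), v])   # virtual token has an all-zero value
    logits = q @ keys.T / np.sqrt(d)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # The zero value row adds nothing to the output, so weight placed on the
    # virtual token only shrinks the contribution of the real tokens.
    return weights @ values, weights[:, 0]

def run_cycles(x, w_q, w_k, w_v, zero_key, max_cycles=4, exit_threshold=0.9):
    """Reuse one set of projection weights for up to max_cycles passes.

    Hypothetical exit rule (an assumption, not taken from the paper):
    stop once the mean weight on the virtual token exceeds exit_threshold.
    """
    cycles_used = 0
    for _ in range(max_cycles):
        out, zero_score = attention_with_zero_token(x @ w_q, x @ w_k, x @ w_v, zero_key)
        x = x + out  # residual update with the shared parameters
        cycles_used += 1
        if zero_score.mean() > exit_threshold:
            break
    return x, cycles_used

# Toy usage with identity projections (illustrative values only).
rng = np.random.default_rng(0)
x0 = rng.normal(size=(3, 4))
eye = np.eye(4)
_, used = run_cycles(x0, eye, eye, eye, zero_key=np.zeros(4))
```

Because the virtual token's value is zero, a query that puts most of its weight there leaves the hidden state nearly unchanged, which is why that weight can plausibly serve as a signal that further cycles add little.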
Anthology ID:
2026.findings-acl.711
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14498–14515
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.711/
Cite (ACL):
Guanghao Li, Wenhao Jiang, Li Shen, Ming Tang, and Chun Yuan. 2026. Efficient Transformer Parameter Reuse via Zero-Token Mechanism. In Findings of the Association for Computational Linguistics: ACL 2026, pages 14498–14515, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Efficient Transformer Parameter Reuse via Zero-Token Mechanism (Li et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.711.pdf
Checklist:
 2026.findings-acl.711.checklist.pdf