DINT Transformer

Yueyang Cang, Yuhang Liu, Xiaoteng Zhang, Erlu Zhao, Li Shi


Abstract
The DIFF Transformer mitigates interference from irrelevant contexts by introducing a differential attention mechanism, thereby enhancing focus on critical tokens. However, this architecture suffers from two major limitations: first, its use of two independent attention matrices leads to numerical instability, and second, it lacks global context modeling, which is essential for identifying globally significant tokens. To address these challenges, we propose the DINT Transformer, which extends the DIFF Transformer by incorporating an integral mechanism. By computing global importance scores and integrating them into the attention matrix, the DINT Transformer not only improves overall numerical stability but also significantly enhances its ability to capture global dependencies. Experimental results demonstrate that the DINT Transformer achieves superior accuracy and robustness across various practical applications, including long-context language modeling and key information retrieval. These advancements establish the DINT Transformer as a highly effective and promising architecture.
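The following is a minimal sketch of the idea described in the abstract, not the authors' implementation. It assumes the two-softmax differential attention form of the DIFF Transformer and approximates the "integral" global importance score as a column mean of the differential attention map that is added back into every query row; the function name `dint_attention` and the weights `lam` and `gamma` are illustrative choices.

```python
import torch
import torch.nn.functional as F

def dint_attention(q1, k1, q2, k2, v, lam=0.5, gamma=0.1):
    """Hypothetical sketch: differential attention plus a global
    'integral' term.  q*, k*: (batch, seq, d); v: (batch, seq, d_v)."""
    d = q1.size(-1)
    # Two independent softmax attention maps, as in the DIFF Transformer.
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / d ** 0.5, dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / d ** 0.5, dim=-1)
    diff = a1 - lam * a2                           # differential attention
    # Global importance score per key token: average attention it receives
    # across all queries (column mean), broadcast back to every query row.
    importance = diff.mean(dim=-2, keepdim=True)   # (batch, 1, seq)
    attn = diff + gamma * importance               # integrate global scores
    return attn @ v

# Usage example with random tensors.
b, n, d = 2, 16, 32
q1, k1, q2, k2, v = (torch.randn(b, n, d) for _ in range(5))
out = dint_attention(q1, k1, q2, k2, v)            # shape (2, 16, 32)
```

Adding a shared global term to every row also counteracts the sign cancellations that make the pure difference of two attention maps numerically unstable, which is the stability benefit the abstract claims.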
Anthology ID:
2025.emnlp-main.495
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
9812–9820
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.495/
Cite (ACL):
Yueyang Cang, Yuhang Liu, Xiaoteng Zhang, Erlu Zhao, and Li Shi. 2025. DINT Transformer. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 9812–9820, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
DINT Transformer (Cang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.495.pdf
Checklist:
2025.emnlp-main.495.checklist.pdf