From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons

Xiangyu Ma, Teng Xiao, Zuchao Li, Lefei Zhang


Abstract
Diffusion models promise efficient parallel text generation but rely on bidirectional attention, creating a structural mismatch with pre-trained Autoregressive (AR) models. This incompatibility precludes reusing robust AR priors, necessitating prohibitive pre-training from scratch. To bridge this gap, we propose FLUID, a framework that efficiently adapts AR backbones to the diffusion paradigm. By enforcing Strictly Causal Alignment, FLUID enables seamless initialization from standard GPT-style checkpoints, circumventing the need for massive pre-training. Furthermore, we introduce Elastic Horizons, an entropy-driven mechanism that dynamically modulates denoising strides based on local information density rather than fixed schedules. Experiments demonstrate that FLUID achieves state-of-the-art performance while reducing training costs by orders of magnitude, effectively reconciling established AR foundations with efficient parallel generation. Our code is available at https://huggingface.co/MYTH-Lab/FLUID.
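The abstract describes Elastic Horizons only at a high level, as an entropy-driven choice of denoising stride. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: the function names, the threshold rule, and the default values are all assumptions introduced here. It computes the Shannon entropy of each upcoming position's predictive distribution and commits positions in one step until it reaches the first position whose entropy exceeds a threshold.

```python
import math
from typing import Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Entropy (in nats) of a single predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def elastic_stride(
    dists: Sequence[Sequence[float]],
    threshold: float = 0.5,   # hypothetical entropy cutoff, not from the paper
    max_stride: int = 4,      # hypothetical upper bound on the stride
) -> int:
    """Pick how many upcoming positions to denoise in one step.

    Counts leading positions whose entropy is at or below `threshold`,
    stopping at the first uncertain position. Always returns at least 1
    so that generation makes progress.
    """
    stride = 0
    for probs in dists[:max_stride]:
        if shannon_entropy(probs) > threshold:
            break
        stride += 1
    return max(1, stride)


if __name__ == "__main__":
    # Two confident positions followed by a uniform (uncertain) one.
    upcoming = [
        [0.97, 0.01, 0.01, 0.01],
        [0.90, 0.05, 0.03, 0.02],
        [0.25, 0.25, 0.25, 0.25],
        [1.00, 0.00, 0.00, 0.00],
    ]
    print(elastic_stride(upcoming))
```

Under this assumed rule, low-entropy (predictable) regions are covered with long strides and high-entropy regions fall back to a stride of one, which is one plausible reading of "local information density rather than fixed schedules".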
Anthology ID:
2026.acl-long.958
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
20914–20927
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.958/
Cite (ACL):
Xiangyu Ma, Teng Xiao, Zuchao Li, and Lefei Zhang. 2026. From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 20914–20927, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons (Ma et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.958.pdf
Checklist:
 2026.acl-long.958.checklist.pdf