Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules
Amr Mohamed, Yang Zhang, Michalis Vazirgiannis, Guokan Shang
Abstract
Diffusion large language models (dLLMs) offer a promising alternative to autoregressive models, but their practical utility is severely hampered by slow, iterative sampling. We present *SchED*, a training-free, model-agnostic early-exit algorithm that terminates diffusion decoding using a progress-aware confidence threshold. We evaluate *SchED* across multiple diffusion model families and a diverse set of benchmarks spanning multiple-choice, math, long-form QA, and translation. *SchED* delivers substantial acceleration: on instruction-tuned models, it achieves approximately 4× speedups while retaining baseline performance on average. On base models, *SchED* yields consistent speedups with 99.1–100% performance retention, reaching up to 2.34× under more aggressive settings. Under a conservative quality-penalized speed metric, *SchED* consistently outperforms prior confidence-based early-exit methods, including on long-form generation, where existing approaches tend to break down. An entropy analysis of the model’s token predictions reveals that instruction tuning speeds up the decay of predictive entropy. By leveraging inherent confidence stabilization as a signal for computational efficiency, *SchED* provides a robust framework for efficient dLLM inference.
- Anthology ID:
- 2026.findings-acl.1782
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 35793–35807
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1782/
- Cite (ACL):
- Amr Mohamed, Yang Zhang, Michalis Vazirgiannis, and Guokan Shang. 2026. Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules. In Findings of the Association for Computational Linguistics: ACL 2026, pages 35793–35807, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Fast-Decoding Diffusion Language Models via Progress-Aware Confidence Schedules (Mohamed et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1782.pdf