DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration

Yanru Huo, Ziyue Jiang, Zuoli Tang, Qingyang Hong, Zhou Zhao


Abstract
While Diffusion Transformers (DiT) have advanced non-autoregressive (NAR) speech synthesis, their high computational cost remains a major limitation. Existing acceleration approaches for DiT-based text-to-speech (TTS) models predominantly focus on reducing sampling steps through distillation, and therefore incur additional training costs. We introduce DiTReducio, a training-free acceleration framework that compresses computation in DiT-based TTS models through a progressive calibration process. We propose two compression methods, Temporal Skipping and Branch Skipping, to eliminate redundant computations during inference. Moreover, based on two characteristic attention patterns identified within DiT layers, we devise a pattern-guided strategy to selectively apply the compression methods. Adjustable compression thresholds allow a flexible trade-off between generation quality and computational efficiency. Experimental evaluations on F5-TTS and MegaTTS 3 demonstrate that DiTReducio achieves a 75.4% reduction in FLOPs and improves the Real-Time Factor (RTF) by 37.1%, while preserving generation quality. The code is available at https://github.com/MM-Speech/DiTReducio.
Anthology ID:
2026.findings-acl.1157
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
23101–23116
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1157/
Cite (ACL):
Yanru Huo, Ziyue Jiang, Zuoli Tang, Qingyang Hong, and Zhou Zhao. 2026. DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration. In Findings of the Association for Computational Linguistics: ACL 2026, pages 23101–23116, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
DiTReducio: A Training-Free Acceleration for DiT-Based TTS via Progressive Calibration (Huo et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1157.pdf
Checklist:
2026.findings-acl.1157.checklist.pdf