Bypassing Neural Evaluations for Fast Audio Editing via Adaptive Trajectory Extrapolation

Xiaoqian Liu, Zhengkun Ge, Jianjin Wang, Haoran Zhang, Yuan Ge, Kaiyan Chang, Chen Xu, Tong Xiao, Zhengtao Yu, Linfeng Zhang, JingBo Zhu


Abstract
Recent advancements in audio diffusion models have significantly improved text-to-audio editing via inversion techniques. However, these models typically rely on dense, fixed-step sampling trajectories to maintain structural integrity during inversion and generation, leading to prohibitive computational costs. We propose AdaTE, a model-agnostic Adaptive Trajectory Extrapolation framework that accelerates the inversion-based editing process by dynamically evaluating only the most critical generative phases. Specifically, we introduce a hierarchical probing mechanism that monitors curvature acceleration and information gain to detect pivotal transitions within the latent flow. This allows the model to selectively skip redundant segments via linear extrapolation while preserving dense neural evaluations for complex semantic changes. Extensive experiments across AudioLDM2, Auffusion, and Tango2 demonstrate that AdaTE achieves up to a 3.9× speedup with negligible loss in fidelity. AdaTE significantly shifts the Pareto frontier, providing an efficient solution for high-fidelity audio synthesis and editing.
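The skip-or-evaluate idea described in the abstract can be illustrated with a minimal sketch on a toy ODE. Everything here is an assumption made for illustration: the function name `adaptive_extrapolation_sampler`, the `threshold` and `max_skips` parameters, and the relative velocity-change test, which is only a simple stand-in for the curvature-acceleration and information-gain probes that the abstract mentions but does not specify. It is not the authors' AdaTE implementation.

```python
import math


def adaptive_extrapolation_sampler(velocity_fn, x0, num_steps,
                                   threshold=0.05, max_skips=2):
    """Euler-integrate dx/dt = velocity_fn(x, t) over t in [0, 1].

    Hypothetical sketch: when the last two velocities barely differ
    (a crude flatness proxy, assumed here), the next velocity is
    linearly extrapolated instead of calling velocity_fn again.
    Returns the final state and the number of velocity_fn calls.
    """
    dt = 1.0 / num_steps
    x = list(x0)
    v_prev, v_curr = None, None   # two most recent velocities
    evals = 0                     # count of "expensive" evaluations
    skips_in_a_row = 0

    for i in range(num_steps):
        t = i * dt
        can_skip = False
        if v_prev is not None and skips_in_a_row < max_skips:
            # Relative change between consecutive velocities.
            diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(v_curr, v_prev)))
            norm = math.sqrt(sum(a * a for a in v_curr)) + 1e-12
            can_skip = diff / norm < threshold

        if can_skip:
            # Linear extrapolation: v_next = v_curr + (v_curr - v_prev).
            v_next = [2.0 * a - b for a, b in zip(v_curr, v_prev)]
            skips_in_a_row += 1
        else:
            # Dense evaluation for segments that change quickly.
            v_next = velocity_fn(x, t)
            evals += 1
            skips_in_a_row = 0

        x = [xi + dt * vi for xi, vi in zip(x, v_next)]
        v_prev, v_curr = v_curr, v_next

    return x, evals


if __name__ == "__main__":
    # Toy stand-in for a neural velocity field: exponential decay.
    final, evals = adaptive_extrapolation_sampler(
        lambda x, t: [-xi for xi in x], [1.0], num_steps=50)
    print(f"final state: {final[0]:.4f}, model calls: {evals} of 50 steps")
```

In this sketch, capping consecutive skips with `max_skips` forces a periodic real evaluation so that extrapolation error cannot accumulate unchecked; a real editing pipeline would replace `velocity_fn` with the diffusion model's noise or velocity prediction.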
Anthology ID:
2026.findings-acl.820
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
16633–16647
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.820/
Cite (ACL):
Xiaoqian Liu, Zhengkun Ge, Jianjin Wang, Haoran Zhang, Yuan Ge, Kaiyan Chang, Chen Xu, Tong Xiao, Zhengtao Yu, Linfeng Zhang, and JingBo Zhu. 2026. Bypassing Neural Evaluations for Fast Audio Editing via Adaptive Trajectory Extrapolation. In Findings of the Association for Computational Linguistics: ACL 2026, pages 16633–16647, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Bypassing Neural Evaluations for Fast Audio Editing via Adaptive Trajectory Extrapolation (Liu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.820.pdf
Checklist:
 2026.findings-acl.820.checklist.pdf