Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models

Zhichao Sheng, Shilin Zhou, Chen Gong, Zhenghua Li


Abstract
Large Audio Language Models (LALMs) that employ the Chain-of-Thought paradigm have demonstrated remarkable reasoning capabilities. Although different problems naturally require different depths of reasoning, existing methods typically decide only whether to reason at all and lack fine-grained mechanisms for adapting reasoning length to problem complexity. As a result, LALMs often adopt a one-size-fits-all reasoning strategy, which leads to redundant overthinking on simple tasks and insufficient reasoning on complex ones. In this paper, we conduct an in-depth analysis of LALM reasoning behavior and argue that effective and efficient reasoning should be adaptively aligned with task difficulty. To this end, we propose a difficulty-adaptive reasoning method for LALMs. Specifically, we introduce a reward function that dynamically links reasoning length to the model’s perceived problem difficulty, encouraging shorter reasoning for easy tasks and longer reasoning for more complex ones. Extensive experiments on three datasets demonstrate that our method consistently improves performance while reducing average reasoning length by at least 50%, achieving higher efficiency without sacrificing accuracy.
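The abstract does not give the form of the reward, so the following is only an illustrative sketch of how a length reward could be tied to perceived difficulty, not the authors' formulation. Every name and constant here is an assumption: difficulty is approximated by the share of wrong answers among several sampled responses, the target length is interpolated linearly between two hypothetical token budgets (`min_len`, `max_len`), and the penalty weight `weight` is arbitrary.

```python
from typing import List


def perceived_difficulty(correct_flags: List[bool]) -> float:
    """Assumed proxy: share of sampled answers that are wrong (0 = easy, 1 = hard)."""
    if not correct_flags:
        raise ValueError("need at least one sampled answer")
    return 1.0 - sum(correct_flags) / len(correct_flags)


def target_length(difficulty: float, min_len: int = 32, max_len: int = 512) -> float:
    """Hypothetical token budget, interpolated linearly from min_len to max_len."""
    return min_len + difficulty * (max_len - min_len)


def length_aware_reward(
    is_correct: bool,
    n_tokens: int,
    difficulty: float,
    weight: float = 0.5,
    min_len: int = 32,
    max_len: int = 512,
) -> float:
    """Accuracy reward minus a penalty for straying from the difficulty-based budget."""
    accuracy = 1.0 if is_correct else 0.0
    target = target_length(difficulty, min_len, max_len)
    # Normalise the deviation by max_len and cap it so the penalty stays bounded.
    deviation = min(abs(n_tokens - target) / max_len, 1.0)
    return accuracy - weight * deviation


# An easy question (all samples correct) favours a short chain of thought.
easy = perceived_difficulty([True, True, True, True])
short_reward = length_aware_reward(True, 32, easy)
long_reward = length_aware_reward(True, 512, easy)
```

Under these assumptions a correct 32-token answer to the easy question scores higher than a correct 512-token one, while the ordering reverses when the difficulty is 1.0. This is the behaviour the abstract describes: shorter reasoning for easy tasks, longer reasoning for harder ones.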
Anthology ID:
2026.findings-acl.1640
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
32771–32790
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1640/
Cite (ACL):
Zhichao Sheng, Shilin Zhou, Chen Gong, and Zhenghua Li. 2026. Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 32771–32790, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models (Sheng et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1640.pdf
Checklist:
2026.findings-acl.1640.checklist.pdf