% NOTE(review): the url below points at a preview/staging Anthology host
% (preview.aclanthology.org/ingest-eacl/); once the paper is formally
% published, confirm and switch to the canonical aclanthology.org URL.
@inproceedings{chen-etal-2026-task,
  title     = {Task-Level Instructions Induction for Audio Question Answering from Few Examples},
  author    = {Chen, Po-Chun and
               Huang, Hen-Hsen and
               Chen, Hsin-Hsi},
  editor    = {Demberg, Vera and
               Inui, Kentaro and
               Marquez, Llu{\'\i}s},
  booktitle = {Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 2: Short Papers)},
  month     = mar,
  year      = {2026},
  address   = {Rabat, Morocco},
  publisher = {Association for Computational Linguistics},
  url       = {https://preview.aclanthology.org/ingest-eacl/2026.eacl-short.18/},
  pages     = {244--264},
  isbn      = {979-8-89176-381-4},
  abstract  = {Large audio-language models (LALMs) benefit from Chain-of-Thought (CoT) prompting for audio question answering (AQA), but acquiring audio CoT examples is particularly challenging as it requires sequential listening and careful integration of acoustic and linguistic information. Surprisingly, our experiments reveal that standard few-shot prompting yields inconsistent results compared to zero-shot CoT, with several models showing degraded accuracy. Moreover, few-shot prompting incurs substantially higher inference costs by processing multiple audio demonstrations per inference. We propose Audio-Induct, which induces reusable textual task instructions from few audio examples once per task, requiring no additional demonstrations at inference. Evaluated on 9 LALMs across two benchmarks, Audio-Induct outperforms state-of-the-art prompting methods while maintaining low inference costs. Inducted Task Instructions transfer effectively across models, enabling scalable deployment.},
}
Markdown (Informal)
[Task-Level Instructions Induction for Audio Question Answering from Few Examples](https://preview.aclanthology.org/ingest-eacl/2026.eacl-short.18/) (Chen et al., EACL 2026)
ACL