@inproceedings{hacioglu-etal-2025-speechllms,
title = "{S}peech{LLM}s for Large-scale Contextualized Zero-shot Slot Filling",
author = "Hacioglu, Kadri and
E, Manjunath K and
Stolcke, Andreas",
editor = "Potdar, Saloni and
Rojas-Barahona, Lina and
Montella, Sebastien",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track",
month = nov,
year = "2025",
address = "Suzhou (China)",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-industry.49/",
pages = "703--715",
ISBN = "979-8-89176-333-3",
abstract = "Slot filling is a crucial subtask in spoken language understanding (SLU), traditionally implemented as a cascade of speech recognition followed by one or more natural language understanding (NLU) components. The recent advent of speech-based large language models (speechLLMs), which integrate speech and textual foundation models, has opened new avenues for achieving speech understanding tasks in a more unified, generative, and instruction-following manner while promising data and compute efficiency with zero-shot abilities, generalizing to unseen slot labels. We address the slot-filling task by creating an empirical upper bound for the task, identifying performance, robustness, and generalization gaps, and proposing improvements to the training data, architecture, and training strategies to narrow the gap with the upper bound result. We show that each of these measures improve performance substantially, while highlighting practical challenges and providing empirical guidance and insights for harnessing these emerging models."
}
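
As a rough illustration of the generative, instruction-following formulation the abstract describes, below is a minimal Python sketch (not the authors' code; the function names, prompt wording, and output format are all hypothetical) of how slot filling can be cast as prompting a model with the candidate slot labels, so that labels unseen during training can be handled zero-shot at inference. The speechLLM call itself is stubbed out; any model mapping (audio, prompt) -> text could be plugged in.

# Minimal sketch of slot filling as generative instruction following.
# Slot labels are supplied in the prompt at inference time, so unseen
# labels require no retraining (the zero-shot setting in the abstract).
# All names and formats here are illustrative assumptions, not the paper's.

def build_prompt(slot_labels):
    """Compose an instruction listing the candidate slot labels."""
    labels = ", ".join(slot_labels)
    return (f"Listen to the audio and extract values for these slots: {labels}. "
            f"Answer as 'label: value' lines; omit slots that are absent.")

def parse_slots(generated_text):
    """Parse the model's 'label: value' lines back into a dictionary."""
    slots = {}
    for line in generated_text.splitlines():
        if ":" in line:
            label, value = line.split(":", 1)
            slots[label.strip()] = value.strip()
    return slots

# Usage with a hypothetical model response:
prompt = build_prompt(["destination_city", "departure_date"])
response = "destination_city: Suzhou\ndeparture_date: November 5"  # stand-in for speechllm(audio, prompt)
print(parse_slots(response))  # {'destination_city': 'Suzhou', 'departure_date': 'November 5'}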