Recent Advances in Speech Language Models: A Survey
Wenqian Cui, Dianzhi Yu, Xiaoqi Jiao, Ziqiao Meng, Guangyan Zhang, Qichao Wang, Steven Y. Guo, Irwin King
Abstract
Text-based Large Language Models (LLMs) have recently gained significant attention, primarily for their capabilities in text-based interactions. However, natural human interaction often relies on speech, highlighting the need for voice-based models. In this context, Speech Language Models (SpeechLMs), foundation models designed to understand and generate speech, emerge as a promising solution for end-to-end speech interaction. This survey offers a comprehensive overview of recent approaches to building SpeechLMs, outlining their core architectural components, training methodologies, evaluation strategies, and the challenges and potential directions for future research in this rapidly advancing field. The GitHub repository is available at https://github.com/dreamtheater123/Awesome-SpeechLM-Survey
- Anthology ID: 2025.acl-long.682
- Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 13943–13970
- URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.682/
- Cite (ACL): Wenqian Cui, Dianzhi Yu, Xiaoqi Jiao, Ziqiao Meng, Guangyan Zhang, Qichao Wang, Steven Y. Guo, and Irwin King. 2025. Recent Advances in Speech Language Models: A Survey. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13943–13970, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): Recent Advances in Speech Language Models: A Survey (Cui et al., ACL 2025)
- PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.682.pdf