CSPB: Conversational Speech Processing Benchmark for Self-supervised Speech Models
Zili Huang, Matthew Maciejewski, Leibny Paola Garcia Perera, Shinji Watanabe, Sanjeev Khudanpur
Abstract
Recent advances in self-supervised learning (SSL) have led to powerful speech representation models, yet their robustness in real-world conversational settings remains largely untested. Most existing benchmarks focus on clean, single-speaker, single-channel audio, failing to reflect the complexities of natural human interaction, where background noise, reverberation, and overlapping speech are the norm. To bridge these critical gaps, we present the Conversational Speech Processing Benchmark (CSPB), a new benchmark designed to assess the robustness of SSL speech models in realistic conversational scenarios. CSPB is constructed from four multi-party datasets (AMI, AliMeeting, MMCSG, and DiPCo) and supports both single-channel and multi-channel evaluation. By releasing CSPB as an open-source toolkit, we aim to establish a unified framework for evaluating and advancing robust, spatially-aware self-supervised speech models.
- Anthology ID:
- 2026.eacl-long.275
- Volume:
- Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Vera Demberg, Kentaro Inui, Lluís Marquez
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5878–5893
- URL:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.275/
- Cite (ACL):
- Zili Huang, Matthew Maciejewski, Leibny Paola Garcia Perera, Shinji Watanabe, and Sanjeev Khudanpur. 2026. CSPB: Conversational Speech Processing Benchmark for Self-supervised Speech Models. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5878–5893, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- CSPB: Conversational Speech Processing Benchmark for Self-supervised Speech Models (Huang et al., EACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-eacl/2026.eacl-long.275.pdf