Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features
Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis
Abstract
Predictive models make mistakes and have biases. To combat both, we need to understand their predictions.Explainable AI (XAI) provides insights into models for vision, language, and tabular data. However, only a few approaches exist for speech classification models. Previous works focus on a selection of spoken language understanding (SLU) tasks, and most users find their explanations challenging to interpret.We propose a novel approach to explain speech classification models. It provides two types of insights. (i) Word-level. We measure the impact of each audio segment aligned with a word on the outcome. (ii) Paralinguistic. We evaluate how non-linguistic features (e.g., prosody and background noise) affect the outcome if perturbed.We validate our approach by explaining two state-of-the-art SLU models on two tasks in English and Italian. We test their plausibility with human subject ratings. Our results show that the explanations correctly represent the model’s inner workings and are plausible to humans.- Anthology ID:
- 2024.eacl-long.136
- Volume:
- Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- March
- Year:
- 2024
- Address:
- St. Julian’s, Malta
- Editors:
- Yvette Graham, Matthew Purver
- Venue:
- EACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2221–2238
- Language:
- URL:
- https://aclanthology.org/2024.eacl-long.136
- DOI:
- Cite (ACL):
- Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, and Elena Baralis. 2024. Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2221–2238, St. Julian’s, Malta. Association for Computational Linguistics.
- Cite (Informal):
- Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features (Pastor et al., EACL 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.eacl-long.136.pdf