Speech-Integrated Modeling for Behavioral Coding in Counseling
Do June Min, Verónica Pérez-Rosas, Kenneth Resnicow, Rada Mihalcea
Abstract
Computational models of psychotherapy often ignore vocal cues by relying solely on text. To address this, we propose MISQ, a framework that integrates speech features directly into language models using a speech encoder and lightweight adapter. MISQ improves behavioral analysis in counseling conversations, achieving ~5% relative gains over text-only or indirect speech methods, underscoring the value of vocal signals like tone and prosody.
- Anthology ID: 2025.sigdial-1.10
- Volume: Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month: August
- Year: 2025
- Address: Avignon, France
- Editors: Frédéric Béchet, Fabrice Lefèvre, Nicholas Asher, Seokhwan Kim, Teva Merlin
- Venue: SIGDIAL
- SIG: SIGDIAL
- Publisher: Association for Computational Linguistics
- Pages: 152–158
- URL: https://preview.aclanthology.org/add-orcids-2023-acl/2025.sigdial-1.10/
- Cite (ACL): Do June Min, Verónica Pérez-Rosas, Kenneth Resnicow, and Rada Mihalcea. 2025. Speech-Integrated Modeling for Behavioral Coding in Counseling. In Proceedings of the 26th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 152–158, Avignon, France. Association for Computational Linguistics.
- Cite (Informal): Speech-Integrated Modeling for Behavioral Coding in Counseling (Min et al., SIGDIAL 2025)
- PDF: https://preview.aclanthology.org/add-orcids-2023-acl/2025.sigdial-1.10.pdf