Disfluent Cues for Enhanced Speech Understanding in Large Language Models

Morteza Rohanian; Farhad Nooralahzadeh; Omid Rohanian; David Clifton; Michael Krauthammer

doi:10.18653/v1/2023.findings-emnlp.238

Abstract

In computational linguistics, the common practice is to “clean” disfluent content from spontaneous speech. However, we hypothesize that these disfluencies might serve as more than mere noise, potentially acting as informative cues. We use a range of pre-trained models for a reading comprehension task involving disfluent queries, specifically featuring different types of speech repairs. The findings indicate that certain disfluencies can indeed improve model performance, particularly those stemming from context-based adjustments. However, large-scale language models struggle to handle repairs involving decision-making or the correction of lexical or syntactic errors, suggesting a crucial area for potential improvement. This paper thus highlights the importance of a nuanced approach to disfluencies, advocating for their potential utility in enhancing model performance rather than their removal.

Anthology ID: 2023.findings-emnlp.238
Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
Month: December
Year: 2023
Address: Singapore
Editors: Houda Bouamor, Juan Pino, Kalika Bali
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3676–3684
URL: https://aclanthology.org/2023.findings-emnlp.238
DOI: 10.18653/v1/2023.findings-emnlp.238
Cite (ACL): Morteza Rohanian, Farhad Nooralahzadeh, Omid Rohanian, David Clifton, and Michael Krauthammer. 2023. Disfluent Cues for Enhanced Speech Understanding in Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 3676–3684, Singapore. Association for Computational Linguistics.
Cite (Informal): Disfluent Cues for Enhanced Speech Understanding in Large Language Models (Rohanian et al., Findings 2023)
PDF: https://preview.aclanthology.org/improve-issue-templates/2023.findings-emnlp.238.pdf
