Speech-Controlled Smart Speaker for Accurate, Real-Time Health and Care Record Management

Jonathan E. Carrick, Nina Dethlefs, Lisa Greaves, Venkata M. V. Gunturi, Rameez Raja Kureshi, Yongqiang Cheng


Abstract
To help alleviate the pressures felt by care workers, we have begun new research into improving the efficiency of care plan management by advancing recent developments in automatic speech recognition. Our novel approach adapts off-the-shelf tools in a purpose-built application for the speech domain, addressing the challenges of accent adaptation, real-time processing and speech hallucinations. We extend the speech-recognition scope of OpenAI’s Whisper model through fine-tuning, reducing word error rates (WERs) from 16.8 to 1.0 on a range of British dialects. Speech hallucinations arise as a side effect of adapting the model to real-time recognition; by enforcing a signal-to-noise ratio threshold and audio stream checks, we achieve a WER of 5.1, compared to 14.9 with Whisper’s original model. These ongoing research efforts tackle the challenges that must be solved to build the speech-control basis for a custom smart speaker system that is both accurate and timely.
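For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. It is not the authors' evaluation code. The function name `word_error_rate`, the lowercase-and-split normalisation, the example sentences, and the choice to report the result as a percentage are all assumptions made for illustration; the abstract does not state how its figures were normalised.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate as a percentage (illustrative sketch).

    WER = (substitutions + deletions + insertions) / reference word count.
    Text normalisation here is a simple lowercase + whitespace split,
    which is an assumption and not taken from the paper.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one word dropped from a five-word reference.
score = word_error_rate(
    "the patient took her medication",
    "the patient took medication",
)
print(f"WER: {score:.1f}")
```

Under this definition a lower score is better, and a score can exceed 100 when the hypothesis contains many inserted words, which is one way hallucinated output shows up in the metric.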
Anthology ID:
2025.iwsds-1.25
Volume:
Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology
Month:
May
Year:
2025
Address:
Bilbao, Spain
Editors:
Maria Ines Torres, Yuki Matsuda, Zoraida Callejas, Arantza del Pozo, Luis Fernando D'Haro
Venues:
IWSDS | WS
Publisher:
Association for Computational Linguistics
Pages:
238–244
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.iwsds-1.25/
Cite (ACL):
Jonathan E. Carrick, Nina Dethlefs, Lisa Greaves, Venkata M. V. Gunturi, Rameez Raja Kureshi, and Yongqiang Cheng. 2025. Speech-Controlled Smart Speaker for Accurate, Real-Time Health and Care Record Management. In Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology, pages 238–244, Bilbao, Spain. Association for Computational Linguistics.
Cite (Informal):
Speech-Controlled Smart Speaker for Accurate, Real-Time Health and Care Record Management (Carrick et al., IWSDS 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.iwsds-1.25.pdf