Hitendra Sarma Thogarcheti


2025

Dysarthria, a motor speech disorder affecting over 46 million individuals globally, impairs both intelligibility and emotional expression in communication. This work introduces a novel framework for emotion-aware sentence reconstruction from dysarthric speech using Large Language Models (LLMs) fine-tuned with QLoRA, namely LLaMA 3.1 and Mixtral 8x7B. Our pipeline recognizes emotion directly from raw audio and conditions textual reconstruction on that emotional context to enhance both semantic and affective fidelity. We propose the Multimodal Communication Dysarthria Score (MCDS), a holistic evaluation metric combining BLEU (B), emotion consistency (E), semantic similarity (S), and human ratings (H):

MCDS = αB + βE + γS + δH, where α + β + γ + δ = 1.

On our extended TORGO+ dataset, our emotion-aware LLM achieves an MCDS of 0.87 and a BLEU score of 72.4%, significantly outperforming traditional pipelines such as Kaldi GMM-HMM (MCDS: 0.52, BLEU: 38.1%) and Whisper-based models, and surpassing baseline LLM systems by 0.09 MCDS. These results set a new benchmark for emotionally intelligent dysarthric speech reconstruction; future directions include multilingual support and real-time deployment.
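
As a minimal illustration of how the MCDS combination might be computed, consider the sketch below. The equal weights and the three non-BLEU component values are assumptions for demonstration only; the paper specifies just the weighted form and the constraint that the weights sum to 1.

    def mcds(bleu, emotion_consistency, semantic_similarity, human_rating,
             weights=(0.25, 0.25, 0.25, 0.25)):
        """Weighted MCDS combination; all component scores normalized to [0, 1].

        The equal weights here are illustrative placeholders; the metric only
        requires alpha + beta + gamma + delta = 1.
        """
        alpha, beta, gamma, delta = weights
        assert abs(alpha + beta + gamma + delta - 1.0) < 1e-9, "weights must sum to 1"
        return (alpha * bleu + beta * emotion_consistency
                + gamma * semantic_similarity + delta * human_rating)

    # Example: BLEU 72.4% -> 0.724 as reported; the other three values
    # are hypothetical component scores.
    score = mcds(0.724, 0.90, 0.92, 0.94)
    print(f"MCDS = {score:.2f}")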
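
A sketch of the emotion-conditioned reconstruction step is shown next, assuming a generic instruction-style prompt. The checkpoint name, prompt wording, and example inputs are hypothetical stand-ins; the paper's models are QLoRA-tuned LLaMA 3.1 and Mixtral 8x7B.

    from transformers import pipeline

    # Hypothetical checkpoint; stands in for a QLoRA-fine-tuned reconstruction model.
    generator = pipeline("text-generation", model="your-org/llama-3.1-dysarthria-qlora")

    asr_hypothesis = "I wan go store now"  # noisy transcript of dysarthric speech
    emotion = "frustrated"                 # emotion predicted directly from raw audio

    # Condition the reconstruction on the recognized emotion so the output
    # preserves affective as well as semantic content.
    prompt = (
        f"The speaker sounds {emotion}. Reconstruct their intended sentence, "
        f"preserving both meaning and emotional tone.\n"
        f"Transcript: {asr_hypothesis}\nReconstruction:"
    )
    print(generator(prompt, max_new_tokens=40)[0]["generated_text"])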