@inproceedings{r-etal-2024-cen,
title = "{CEN}{\_}{A}mrita@{LT}-{EDI} 2024: A Transformer based Speech Recognition System for Vulnerable Individuals in {T}amil",
author = "R, Jairam and
G, Jyothish and
B, Premjith and
M, Viswa",
editor = {Chakravarthi, Bharathi Raja and
B, Bharathi and
Buitelaar, Paul and
Durairaj, Thenmozhi and
Kov{\'a}cs, Gy{\"o}rgy and
Garc{\'i}a Cumbreras, Miguel {\'A}ngel},
booktitle = "Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion",
month = mar,
year = "2024",
address = "St. Julian's, Malta",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.ltedi-1.21/",
pages = "190--195",
abstract = "Speech recognition is known to be a specialized application of speech processing. Automatic speech recognition (ASR) systems are designed to perform the speech-to-text task. Although ASR systems have been the subject of extensive research, they still encounter certain challenges when speech variations arise. The speaker`s age, gender, vulnerability, and other factors are the main causes of the variations in speech. In this work, we propose a fine-tuned speech recognition model for recognising the spoken words of vulnerable individuals in Tamil. This research utilizes a dataset sourced from the LT-EDI@EACL2024 shared task. We trained and tested pre-trained ASR models, including XLS-R and Whisper. The findings highlight that the fine-tuned Whisper ASR model surpasses the XLSR, achieving a word error rate (WER) of 24.452, signifying its superior performance in recognizing speech from diverse individuals."
}
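
The abstract reports model quality as word error rate (WER). As a quick point of reference, the snippet below is a minimal sketch of how WER is commonly computed, assuming the Python `jiwer` library; the Tamil sentences are toy placeholders, not examples from the shared-task data, and the paper's exact evaluation tooling is not specified here.

```python
# Minimal WER computation sketch using the jiwer library (assumption:
# this is not necessarily the evaluation code used in the paper).
import jiwer

# Toy reference (gold) and hypothesis (ASR output) transcripts in Tamil.
reference = "நான் பள்ளிக்கு செல்கிறேன்"
hypothesis = "நான் பள்ளி செல்கிறேன்"

# WER = (substitutions + deletions + insertions) / number of reference words
print(f"WER: {jiwer.wer(reference, hypothesis):.3f}")  # 0.333 for this toy pair
```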