2024
pdf
abs
Overview of the Third Shared Task on Speech Recognition for Vulnerable Individuals in Tamil
Bharathi B
|
Bharathi Raja Chakravarthi
|
Sripriya N
|
Rajeswari Natarajan
|
Suhasini S
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
The overview of the shared task on speech recognition for vulnerable individuals in Tamil (LT-EDI-2024) is described in this paper. The work comes with a Tamil dataset that was gath- ered from elderly individuals who identify as male, female, or transgender. The audio sam- ples were taken in public places such as marketplaces, vegetable shops, hospitals, etc. The training phase and the testing phase are when the dataset is made available. The task required of the participants was to handle audio signals using various models and techniques, and then turn in their results as transcriptions of the pro- vided test samples. The participant’s results were assessed using WER (Word Error Rate). The transformer-based approach was employed by the participants to achieve automatic voice recognition. This overview paper discusses the findings and various pre-trained transformer- based models that the participants employed.
pdf
abs
ASR TAMIL SSN@ LT-EDI-2024: Automatic Speech Recognition system for Elderly People
Suhasini S
|
Bharathi B
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
The results of the Shared Task on Speech Recognition for Vulnerable Individuals in Tamil (LT-EDI-2024) are discussed in this paper. The goal is to create an automated system for Tamil voice recognition. The older population that speaks Tamil is the source of the dataset used in this task. The proposed ASR system is designed with pre-trained model akashsivanandan/wav2vec2-large-xls-r300m-tamil-colab-final. The Tamil common speech dataset is utilized to fine-tune the pretrained model that powers our system. The suggested system receives the test data that was released from the task; transcriptions are then created for the test samples and delivered to the task. Word Error Rate (WER) is the evaluation statistic used to assess the provided result based on the task. Our Proposed system attained a WER of 29.297%.
2023
pdf
abs
ASR_SSN_CSE@LTEDI- 2023: Pretrained Transformer based Automatic Speech Recognition system for Elderly People
Suhasini S
|
Bharathi B
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion
Submission of the paper for the result submitted in Shared Task on Speech Recognition for Vulnerable Individuals in Tamil- LT-EDI-2023. The task is to develop an automatic speech recognition system for Tamil language. The dataset provided in the task is collected from the elderly people who converse in Tamil language. The proposed ASR system is designed with pre-trained model. The pre-trained model used in our system is fine-tuned with Tamil common voice dataset. The test data released from the task is given to the proposed system, now the transcriptions are generated for the test samples and the generated transcriptions is submitted to the task. The result submitted is evaluated by task, the evaluation metric used is Word Error Rate (WER). Our Proposed system attained a WER of 39.8091%.
2022
pdf
abs
SUH_ASR@LT-EDI-ACL2022: Transformer based Approach for Speech Recognition for Vulnerable Individuals in Tamil
Suhasini S
|
Bharathi B
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
An Automatic Speech Recognition System is developed for addressing the Tamil conversational speech data of the elderly people andtransgender. The speech corpus used in this system is collected from the people who adhere their communication in Tamil at some primary places like bank, hospital, vegetable markets. Our ASR system is designed with pre-trained model which is used to recognize the speechdata. WER(Word Error Rate) calculation is used to analyse the performance of the ASR system. This evaluation could help to make acomparison of utterances between the elderly people and others. Similarly, the comparison between the transgender and other people isalso done. Our proposed ASR system achieves the word error rate as 39.65%.