2022
Findings of the Shared Task on Emotion Analysis in Tamil
Anbukkarasi Sampath
|
Thenmozhi Durairaj
|
Bharathi Raja Chakravarthi
|
Ruba Priyadharshini
|
Subalalitha Cn
|
Kogilavani Shanmugavadivel
|
Sajeetha Thavareesan
|
Sathiyaraj Thangasamy
|
Parameswari Krishnamurthy
|
Adeep Hande
|
Sean Benhur
|
Kishore Ponnusamy
|
Santhiya Pandiyan
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
This paper presents an overview of the shared task on emotion analysis in Tamil. The results of the shared task are presented at the workshop. The paper describes the dataset used in the shared task, the task description, the methodologies used by the participants, and the evaluation results of the submissions. The task was organized as two subtasks. Task A used social media comments in Tamil annotated with 11 emotions, and Task B used social media comments in Tamil annotated with 31 fine-grained emotions. Training and development datasets were provided to the participants, and results were evaluated on unseen data. In total, we received around 24 submissions from 13 teams. The models were evaluated using precision, recall, and micro-average metrics.
Findings of the Shared Task on Multi-task Learning in Dravidian Languages
Bharathi Raja Chakravarthi
|
Ruba Priyadharshini
|
Subalalitha Cn
|
Sangeetha S
|
Malliga Subramanian
|
Kogilavani Shanmugavadivel
|
Parameswari Krishnamurthy
|
Adeep Hande
|
Siddhanth U Hegde
|
Roshan Nayak
|
Swetha Valli
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
We present our findings from the first shared task on Multi-task Learning in Dravidian Languages at the second Workshop on Speech and Language Technologies for Dravidian Languages. In this task, a sentence in any of three Dravidian languages must be classified for two closely related tasks, namely Sentiment Analysis (SA) and Offensive Language Identification (OLI). The task spans three Dravidian languages: Kannada, Malayalam, and Tamil. It is one of the first shared tasks to focus on multi-task learning for closely related tasks, especially for a very low-resourced language family such as the Dravidian language family. In total, 55 people signed up to participate in the task and, due to the intricate nature of the task, especially in its first iteration, 3 submissions were received.
Overview of Abusive Comment Detection in Tamil-ACL 2022
Ruba Priyadharshini
|
Bharathi Raja Chakravarthi
|
Subalalitha Cn
|
Thenmozhi Durairaj
|
Malliga Subramanian
|
Kogilavani Shanmugavadivel
|
Siddhanth U Hegde
|
Prasanna Kumaresan
Proceedings of the Second Workshop on Speech and Language Technologies for Dravidian Languages
Social media is one of the significant digital platforms that create a huge impact on people at all levels. The comments posted on social media are powerful enough to change political and business scenarios within a few hours. They also tend to attack a particular individual or a group of individuals. This shared task aims at detecting abusive comments involving Homophobia, Misandry, Counter-speech, Misogyny, Xenophobia, and Transphobia. Hope speech is also identified. A dataset collected from social media, tagged with the above categories in Tamil and Tamil-English code-mixed languages, was given to the participants. The participants used different machine learning and deep learning algorithms. This paper presents an overview of this task, comprising the dataset details and the results of the participants.
Findings of the Shared Task on Speech Recognition for Vulnerable Individuals in Tamil
Bharathi B
|
Bharathi Raja Chakravarthi
|
Subalalitha Cn
|
Sripriya N
|
Arunaggiri Pandian
|
Swetha Valli
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
This paper presents an overview of the shared task on automatic speech recognition in the Tamil language. In the shared task, spontaneous Tamil speech data gathered from elderly and transgender people was given for recognition and evaluation. These utterances were collected from people while they communicated in public locations such as hospitals, markets, and vegetable shops. The speech corpus includes utterances from male, female, and transgender speakers and was split into training and testing data. The task was evaluated using WER (Word Error Rate). The participants used transformer-based models for automatic speech recognition. Different results obtained with different pre-trained transformer models are discussed in this overview paper.
Overview of the Shared Task on Hope Speech Detection for Equality, Diversity, and Inclusion
Bharathi Raja Chakravarthi
|
Vigneshwaran Muralidaran
|
Ruba Priyadharshini
|
Subalalitha Cn
|
John McCrae
|
Miguel Ángel García
|
Salud María Jiménez-Zafra
|
Rafael Valencia-García
|
Prasanna Kumaresan
|
Rahul Ponnusamy
|
Daniel García-Baena
|
José García-Díaz
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Hope speech detection is the task of classifying a sentence as hope speech or non-hope speech given a corpus of sentences. Hope speech is any message or content that is positive, encouraging, reassuring, inclusive and supportive, and that inspires and engenders optimism in the minds of people. In contrast to identifying and censoring negative speech patterns, hope speech detection is focussed on recognising and promoting positive speech patterns online. In this paper, we report an overview of the findings and results from the shared task on hope speech detection for the Tamil, Malayalam, Kannada, English and Spanish languages, conducted at the second workshop on Language Technology for Equality, Diversity and Inclusion (LT-EDI-2022), organised as a part of ACL 2022. The participants were provided with annotated training and development datasets and unlabelled test datasets in all five languages. The goal of the shared task is to classify the given sentences into one of the two hope speech classes. The performances of the systems submitted by the participants were evaluated in terms of micro-F1 score and weighted-F1 score. The datasets for this challenge are openly available
2021
Analysis of Uvama Urubugal in Tamil Sangam Literatures
Subalalitha Cn
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages
Uvama urubugal in Tamil are used to explain a particular context by citing another equivalent context. This is referred to as “Uvamaiyani” in the Tamil grammar rules stated in Tholkappiam, and is called a simile in English. Similes bring out many beautiful poetic contexts. Automatic extraction of such similes can help to build better Natural Language Generation applications such as story generation systems and lyric suggestion systems. This paper attempts to automatically extract the uvama urubugal from Tamil Sangam literature. Natrinai and Mullai Pattu have been used for the analysis. There are 12 uvama urubugal in Tamil as per Nanool, and this paper analyzes the usage of these 12 uvama urubugal in Sangam literature and compares their usage distribution with that in a Tamil film songs dataset comprising 4215 songs. It was found that only two uvama urubugal were used in current-day Tamil film songs. This comparison was done to reveal the diminishing usage of these beautiful uvama urubugal by the current generation and to urge their renewed use.