2025
Overview of the Shared Task on Sentiment Analysis in Tamil and Tulu
Thenmozhi Durairaj | Bharathi Raja Chakravarthi | Asha Hegde | Hosahalli Lakshmaiah Shashirekha | Rajeswari Natarajan | Sajeetha Thavareesan | Ratnasingam Sakuntharaj | Krishnakumari K | Charmathi Rajkumar | Poorvi Shetty | Harshitha S Kumar
Proceedings of the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Sentiment analysis is an essential task for interpreting subjective opinions and emotions in textual data, with significant implications across commercial and societal applications. This paper provides an overview of the shared task on Sentiment Analysis in Tamil and Tulu, organized as part of DravidianLangTech@NAACL 2025. The task comprises two components, one addressing Tamil and the other Tulu, both designed as multi-class classification challenges in which the sentiment of a given text must be categorized as positive, negative, neutral, or unknown. The dataset was compiled by aggregating user-generated content from social media platforms such as YouTube and Twitter, ensuring linguistic diversity and real-world applicability. Participants applied a variety of computational approaches, including traditional machine learning models, deep learning models, pre-trained language models, and a range of feature representation techniques, to tackle the challenges posed by linguistic code-mixing, orthographic variation, and resource scarcity in these low-resource languages.
2024
Overview of Third Shared Task on Homophobia and Transphobia Detection in Social Media Comments
Bharathi Raja Chakravarthi | Prasanna Kumaresan | Ruba Priyadharshini | Paul Buitelaar | Asha Hegde | Hosahalli Shashirekha | Saranya Rajiakodi | Miguel Ángel García | Salud María Jiménez-Zafra | José García-Díaz | Rafael Valencia-García | Kishore Ponnusamy | Poorvi Shetty | Daniel García-Baena
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
This paper provides a comprehensive summary of the “Homophobia and Transphobia Detection in Social Media Comments” shared task, held at LT-EDI@EACL 2024. The objective of the task was to develop systems capable of identifying instances of homophobia and transphobia in social media comments. The challenge covered ten languages: English, Tamil, Malayalam, Telugu, Kannada, Gujarati, Hindi, Marathi, Spanish, and Tulu. Each comment in the dataset was annotated with one of three categories. The shared task attracted significant interest, with over 60 teams participating through the CodaLab platform. Participants' predictions were evaluated using the macro F1 score.
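For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The following is a minimal sketch of that computation in plain Python, not the organizers' evaluation script; the function name `macro_f1` and the placeholder labels `"A"`, `"B"`, `"C"` are assumptions made for illustration, since the abstract does not name the three categories.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed

    labels = set(gold) | set(pred)
    scores = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels, used only to exercise the function.
gold = ["A", "A", "B", "C"]
pred = ["A", "B", "B", "C"]
score = macro_f1(gold, pred)
```

Here the per-class F1 scores are 2/3 for `A`, 2/3 for `B`, and 1 for `C`, so `score` is 7/9. Because each class contributes equally, the metric penalizes systems that ignore minority classes, which matters for imbalanced social media data.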
2023
Poorvi@DravidianLangTech: Sentiment Analysis on Code-Mixed Tulu and Tamil Corpus
Poorvi Shetty
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages
Sentiment analysis in code-mixed languages poses significant challenges, particularly for highly under-resourced languages such as Tulu and Tamil. Existing corpora, primarily sourced from YouTube comments, suffer from class imbalance across sentiment categories. Moreover, the limited number of samples in these corpora hampers effective sentiment classification. This study introduces a new corpus tailored for sentiment analysis of Tulu code-mixed texts. The research applies standard pre-processing techniques to ensure data quality and consistency and to handle class imbalance. Multiple classifiers are then employed to analyze the sentiment of the code-mixed texts, yielding promising results. By leveraging the new corpus, the study contributes to advancing sentiment analysis techniques for under-resourced code-mixed languages. This work serves as a stepping stone towards better understanding and addressing the challenges posed by sentiment analysis in highly under-resourced languages.