Malliga Subramanian

2023

pdf abs
KEC_AI_NLP@DravidianLangTech: Abusive Comment Detection in Tamil Language
Kogilavani Shanmugavadivel | Malliga Subramanian | Shri Durga R | Srigha S | Sree Harene J S | Yasvanth Bala P
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages

Our work aims to identify the negative comments that is associated with Counter-speech,Xenophobia, Homophobia,Transphobia, Misandry, Misogyny, None-of-the-above categories, In order to identify these categories from the given dataset, we propose three different models such as traditional machine learning techniques, deep learning model and transfer Learning model called BERT is also used to analyze the texts. In the Tamil dataset, we are training the models with Train dataset and test the models with Validation data. Our Team Participated in the shared task organised by DravidianLangTech and secured 4th rank in the task of abusive comment detection in Tamil with a macro- f1 score of 0.35. Also, our run was submitted for abusive comment detection in code-mixed languages (Tamil-English) and secured 6th rank with a macro-f1 score of 0.42.

pdf abs
KEC_AI_NLP_DEP @ LT-EDI : Detecting Signs of Depression From Social Media Texts
Kogilavani Shanmugavadivel | Malliga Subramanian | Vasantharan K | Prethish Ga | Sankar S | Sabari S
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion

The goal of this study is to use machine learning approaches to detect depression indications in social media articles. Data gathering, pre-processing, feature extraction, model training, and performance evaluation are all aspects of the research. The collection consists of social media messages classified into three categories: not depressed, somewhat depressed, and severely depressed. The study contributes to the growing field of social media data-driven mental health analysis by stressing the use of feature extraction algorithms for obtaining relevant information from text data. The use of social media communications to detect depression has the potential to increase early intervention and help for people at risk. Several feature extraction approaches, such as TF-IDF, Count Vectorizer, and Hashing Vectorizer, are used to quantitatively represent textual data. These features are used to train and evaluate a wide range of machine learning models, including Logistic Regression, Random Forest, Decision Tree, Gaussian Naive Bayes, and Multinomial Naive Bayes. To assess the performance of the models, metrics such as accuracy, precision, recall, F1 score, and the confusion matrix are utilized. The Random Forest model with Count Vectorizer had the greatest accuracy on the development dataset, coming in at 92.99 percent. And with a macro F1-score of 0.362, we came in 19th position in the shared task. The findings show that machine learning is effective in detecting depression markers in social media articles.

2022

This paper presents the findings of the shared task on Multimodal Sentiment Analysis and Troll meme classification in Dravidian languages held at ACL 2022. Multimodal sentiment analysis deals with the identification of sentiment from video. In addition to video data, the task requires the analysis of corresponding text and audio features for the classification of movie reviews into five classes. We created a dataset for this task in Malayalam and Tamil. The Troll meme classification task aims to classify multimodal Troll memes into two categories. This task assumes the analysis of both text and image features for making better predictions. The performance of the participating teams was analysed using the F1-score. Only one team submitted their results in the Multimodal Sentiment Analysis task, whereas we received six submissions in the Troll meme classification task. The only team that participated in the Multimodal Sentiment Analysis shared task obtained an F1-score of 0.24. In the Troll meme classification task, the winning team achieved an F1-score of 0.596.

We present our findings from the first shared task on Multi-task Learning in Dravidian Languages at the second Workshop on Speech and Language Technologies for Dravidian Languages. In this task, a sentence in any of three Dravidian Languages is required to be classified into two closely related tasks namely Sentiment Analyis (SA) and Offensive Language Identification (OLI). The task spans over three Dravidian Languages, namely, Kannada, Malayalam, and Tamil. It is one of the first shared tasks that focuses on Multi-task Learning for closely related tasks, especially for a very low-resourced language family such as the Dravidian language family. In total, 55 people signed up to participate in the task, and due to the intricate nature of the task, especially in its first iteration, 3 submissions have been received.

The social media is one of the significantdigital platforms that create a huge im-pact in peoples of all levels. The commentsposted on social media is powerful enoughto even change the political and businessscenarios in very few hours. They alsotend to attack a particular individual ora group of individuals. This shared taskaims at detecting the abusive comments in-volving, Homophobia, Misandry, Counter-speech, Misogyny, Xenophobia, Transpho-bic. The hope speech is also identified. Adataset collected from social media taggedwith the above said categories in Tamiland Tamil-English code-mixed languagesare given to the participants. The par-ticipants used different machine learningand deep learning algorithms. This paperpresents the overview of this task compris-ing the dataset details and results of theparticipants.

pdf abs
Transformers at SemEval-2022 Task 5: A Feature Extraction based Approach for Misogynous Meme Detection
Shankar Mahadevan | Sean Benhur | Roshan Nayak | Malliga Subramanian | Kogilavani Shanmugavadivel | Kanchana Sivanraju | Bharathi Raja Chakravarthi
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Social media is an idea created to make theworld smaller and more connected. Recently,it has become a hub of fake news and sexistmemes that target women. Social Media shouldensure proper women’s safety and equality. Filteringsuch information from social media is ofparamount importance to achieving this goal. In this paper, we describe the system developedby our team for SemEval-2022 Task 5: MultimediaAutomatic Misogyny Identification. Wepropose a multimodal training methodologythat achieves good performance on both thesubtasks, ranking 4th for Subtask A (0.718macro F1-score) and 9th for Subtask B (0.695macro F1-score) while exceeding the baselineresults by good margins.