Sowmya Anand

2026

DataBees at SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization
Tanisha Sriram | Sathvika Shankar | Sowmya Anand | Rajalakshmi Sivanaiah | Angel Deborah S | Mirnalinee Thankanadar
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

This paper describes our submission to SemEval-2026 Task 9, Subtask 1: Multilingual Text Classification Challenge (Polarization Detection). Our focus is on how classical and transformer-based models compare when applied to multilingual polarization detection. We aim to understand where each type tends to do well and where it breaks down, particularly once you move from high-resource to low-resource settings. Our experimental setup evaluates classical machine learning models (TF-IDF with Naive Bayes, Logistic Regression, and Linear SVM) alongside language-specific transformer models across multiple languages. For Arabic, Bengali, German, Italian, and Spanish, we leveraged both multilingual and monolingual pre-trained transformers such as mBERT, XLM-R, AraBERTv2, BanglaBERT, and BETO. We compare individual classical and transformer-based models to identify which modeling choices work best for each language. Our results varied substantially across languages. We achieved our best leaderboard rankings in Bengali (6th out of 48 teams) and Italian (6th out of 43 teams), while performance was lower in Arabic (33rd out of 44), German (41st out of 44), and Spanish (46th out of 48). The study highlights the value of comparing classical and transformer-based approaches for multilingual polarization detection and identifies language-specific challenges for future improvement.

2025

DataBees at SemEval-2025 Task 11: Challenges and Limitations in Multi-Label Emotion Detection
Sowmya Anand | Tanisha Sriram | Rajalakshmi Sivanaiah | Angel Deborah S | Mirnalinee Thankanadar
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Text-based emotion detection is crucial in NLP, with applications in sentiment analysis, social media monitoring, and human-computer interaction. This paper presents our approach to the Multi-label Emotion Detection challenge, classifying texts into joy, sadness, anger, fear, and surprise. We experimented with traditional machine learning and transformer-based models, but results were suboptimal: F1 scores of 0.3723 (English), 0.5174 (German), and 0.6957 (Spanish). We analyze the impact of preprocessing, model selection, and dataset characteristics, highlighting key challenges in multi-label emotion classification and potential improvements.
