Neel Sabhahit

2026

NASIMLab at SemEval-2026 Task 9: A Comparative Analysis of Fine-Tuned Small Language Models vs. Generative Large Language Models for Multilingual Polarization Type Detection
Neel Sabhahit | Sanjeevan Selvaganapathy | Mehwish Nasim
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

The POLAR dataset contains social media texts that may be polarized (conflict-inducing or dangerously divisive). The task is to identify which of the following types of polarization are present in a text: political, racial/ethnic, religious, gender/sexual, and other, across 22 languages. In this paper, we propose a system of fine-tuned, language-specific small language models and compare it with state-of-the-art large language models on the POLAR dataset. By fine-tuning a model for each language, we demonstrate that small encoder-only models consistently outperform large language models, especially for low-resource languages. Our system performs well on this task for most low-resource languages, notably taking the top spot on the leaderboard in Burmese (mya), placing within the top 10 for 12 languages and within the top 20 for all remaining languages.

Venues

SemEval (1)
WS (1)
