Neel Sabhahit
2026
NASIMLab at SemEval-2026 Task 9: A Comparative Analysis of Fine-Tuned Small Language Models vs. Generative Large Language Models for Multilingual Polarization Type Detection
Neel Sabhahit | Sanjeevan Selvaganapathy | Mehwish Nasim
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
The POLAR dataset contains social media texts that may be polarized (conflict-inducing or dangerously divisive). The task is to identify which, if any, of the following types of polarization are present in a text: political, racial/ethnic, religious, gender/sexual, and other, across 22 languages. In this paper, we propose a system of fine-tuned, language-specific small language models and compare it with state-of-the-art large language models on the POLAR dataset. By fine-tuning models for each language, we demonstrate that small encoder-only models consistently outperform large language models, especially for low-resource languages. Our system performs well on this task for most low-resource languages, notably taking the top spot on the leaderboard in Burmese (mya), placing within the top 10 for 12 languages, and within the top 20 for all remaining languages.