Stefana Gheorghita


2026

Polarization in online discourse poses significant challenges for natural language processing, particularly in multilingual and culturally diverse environments. In this paper, we address the SemEval-2026 POLAR shared task on multilingual polarization detection across 22 languages. We adopt a staged experimental strategy: we first investigate the problem in a controlled monolingual English setting and then extend the approach to multilingual modeling. We evaluate several transformer-based architectures, including RoBERTa, XLM-RoBERTa, MPNet, and mDeBERTa-v3, combined with techniques designed to mitigate class imbalance, namely weighted loss functions, focal loss, and data augmentation through back-translation and large language models. Experimental results show that no single configuration consistently dominates across all languages, although focal loss and augmentation frequently improve performance in languages with skewed label distributions. Our findings highlight the importance of contextual representations, imbalance-aware training strategies, and language-specific considerations for robust multilingual polarization detection.
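To make the imbalance-aware objective concrete, the following is a minimal sketch of binary focal loss in plain Python. It is an illustration under stated assumptions rather than the system's actual training code: the function names `focal_loss` and `mean_focal_loss` are hypothetical, and the defaults `gamma=2.0` and `alpha=0.5` are illustrative values, not hyperparameters reported here. The loss scales the usual cross-entropy term by (1 - p_t)^gamma, so confidently correct examples contribute little and hard, often minority-class, examples carry more of the gradient.

```python
import math


def focal_loss(prob_positive, label, gamma=2.0, alpha=0.5):
    """Binary focal loss for a single example (illustrative sketch).

    prob_positive: predicted probability of the positive (polarized) class.
    label: 1 for the positive class, 0 for the negative class.
    gamma: focusing parameter (assumed value); gamma=0 gives weighted cross-entropy.
    alpha: weight on the positive class (assumed value); negatives get 1 - alpha.
    """
    eps = 1e-12
    # p_t is the probability the model assigns to the true class.
    p_t = prob_positive if label == 1 else 1.0 - prob_positive
    alpha_t = alpha if label == 1 else 1.0 - alpha
    # Clamp to avoid log(0) on saturated predictions.
    p_t = min(max(p_t, eps), 1.0 - eps)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.5):
    """Average focal loss over a batch of predicted probabilities and labels."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("batch must not be empty")
    total = sum(focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels))
    return total / len(probs)


if __name__ == "__main__":
    # An easy positive (p=0.9) is down-weighted far more than a hard one (p=0.1).
    print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

Setting `alpha` above 0.5 would additionally up-weight the positive class, which is one way to combine the focal term with the class weighting mentioned above; whether that helps is language-dependent and would need to be validated per dataset.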