Dan Dodun-Des-Perrieres

Also published as: Dan Dodun-des-Perrieres

2026

StanceLab at SemEval-2026 Task 9: Addressing Class Imbalance in Multilingual Polarization Detection
Teodor Ivanusca | Dan Dodun-Des-Perrieres | Stefana Gheorghita
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

Polarization in online discourse poses significant challenges for natural language processing, particularly in multilingual and culturally diverse environments. In this paper, we address the SemEval-2026 POLAR shared task on multilingual polarization detection across 22 languages. We adopt a staged experimental strategy that first investigates the problem in a controlled monolingual English setting before extending the approach to multilingual modeling. Our system evaluates several transformer-based architectures, including RoBERTa, XLM-RoBERTa, MPNet, and mDeBERTa-v3, combined with techniques designed to mitigate class imbalance such as weighted loss functions, focal loss, and data augmentation using back-translation and large language models. Experimental results show that no single configuration consistently dominates across all languages. However, focal loss and augmentation frequently improve performance in languages with skewed label distributions. Our findings highlight the importance of contextual representations, imbalance-aware training strategies, and language-specific considerations for robust multilingual polarization detection.
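As a rough illustration of the focal loss mentioned in the abstract, the sketch below computes the per-example loss in plain Python. It is not taken from the paper: the function names, the default `gamma=2.0`, and the optional `class_weights` argument are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, class_weights=None):
    """Focal loss for one example: -w_t * (1 - p_t)^gamma * log(p_t).

    gamma and class_weights are illustrative assumptions, not values from the paper.
    With gamma=0 and no weights this reduces to standard cross-entropy.
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if class_weights is None else class_weights[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)
```

Under these assumptions, a confidently correct example contributes much less to the loss than a misclassified one, which is why the technique is often paired with skewed label distributions.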


asetclarity at SemEval-2026 Task 6: An Imbalance-Aware RoBERTa Cross-Encoder for Political Response Clarity Classification
Maria-Antonia-Emanuela Pascu | Dan Dodun-des-Perrieres | Daniela Gifu
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We address response-clarity classification in political interviews as defined in SemEval-2026 Task 6: CLARITY - Unmasking Political Question Evasions, Task 1, where systems must label each question–answer pair as Clear Reply, Ambivalent, or Clear Non-Reply. We present a reproducible end-to-end pipeline built around a single-stream RoBERTa-large cross-encoder fine-tuned for three-way classification using deterministic text normalization, concatenated QA inputs, and imbalance-aware training (minority oversampling and class-weighted loss). To improve robustness, we train a 5-fold stratified ensemble and combine models via soft voting. Our official shared-task submission obtains 0.76 macro-F1 on the official leaderboard, ranking 16th out of 41 participating systems. Finally, we deploy the classifier in a lightweight web application supporting both direct text input and audio-based analysis through automatic transcription, enabling interactive inspection of predicted clarity categories.
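The soft-voting step described in the abstract can be sketched as averaging the per-class probabilities of the fold models and taking the highest-scoring class. This is a minimal illustration, not the authors' implementation: the `soft_vote` helper, the label order, and the example probabilities are all hypothetical.

```python
# Label order is an assumption for illustration only.
LABELS = ["Clear Reply", "Ambivalent", "Clear Non-Reply"]


def soft_vote(fold_probs):
    """Average class probabilities across models and return (best_index, averages).

    fold_probs: one probability vector per fold model, all of the same length.
    """
    n_models = len(fold_probs)
    n_classes = len(fold_probs[0])
    avg = [sum(p[c] for p in fold_probs) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg


# Hypothetical outputs from three fold models for one question-answer pair.
example = [
    [0.50, 0.40, 0.10],
    [0.20, 0.70, 0.10],
    [0.45, 0.40, 0.15],
]
best, avg = soft_vote(example)
print(LABELS[best])
```

In this made-up example two of the three models individually prefer the first class, yet the averaged probabilities favor the second, which shows how soft voting can differ from a simple majority vote.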

Venues

SemEval (2)
WS (2)
