Gladiators at #SMM4H–HeaRD 2026: Multi-Seed XLM-RoBERTa Ensemble with Focal Loss and Per-Language Threshold Optimization for Multilingual Adverse Drug Event Detection

Ankit Kumar Singh

Abstract

This paper describes the Gladiators system for Task 1 of the SMM4H 2026 shared task on binary classification of adverse drug event (ADE) mentions in multilingual social media posts. Our system fine-tunes three XLM-RoBERTa large models with different random seeds using focal loss (α=0.75, γ=2.0) and 3× positive oversampling, then averages their predicted probabilities and applies per-language threshold optimization. On the development set, our ensemble achieves a pooled binary F1 of 0.7505. On the official test set, where Farsi appeared as a surprise language comprising 35.5% of samples, our system achieves F1 = 0.6039, above the competition mean (0.5465) and median (0.5798). We evaluated eleven approaches and document key negative results. Post-evaluation, a six-model cross-regime ensemble improved dev F1 to 0.7585.
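The three ingredients named in the abstract (focal loss, probability averaging across seeds, and per-language thresholds) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the threshold grid (0.05 to 0.95 in steps of 0.01), the choice to tune each language on its own F1, and the 0.5 fallback for a language unseen at tuning time are all assumptions made for illustration. Only α=0.75 and γ=2.0 come from the abstract.

```python
import math
from collections import defaultdict


def focal_loss(p, y, alpha=0.75, gamma=2.0):
    """Binary focal loss for one example; p is the predicted P(positive)."""
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class weight
    p_t = min(max(p_t, 1e-12), 1.0)             # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def average_probs(per_model_probs):
    """Average P(positive) across models; input is one list per model."""
    n_models = len(per_model_probs)
    return [sum(col) / n_models for col in zip(*per_model_probs)]


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class; 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(probs, labels, langs, grid=None):
    """Pick, per language, the grid threshold with the best dev F1.

    The grid and the per-language F1 objective are assumptions, not
    details taken from the paper.
    """
    grid = grid or [i / 100 for i in range(5, 96)]
    by_lang = defaultdict(list)
    for p, y, lang in zip(probs, labels, langs):
        by_lang[lang].append((p, y))
    thresholds = {}
    for lang, pairs in by_lang.items():
        ys = [y for _, y in pairs]
        ps = [p for p, _ in pairs]
        # max() keeps the first (lowest) threshold among ties.
        thresholds[lang] = max(
            grid, key=lambda t: f1_score(ys, [int(p >= t) for p in ps])
        )
    return thresholds


def predict(probs, langs, thresholds, default=0.5):
    """Apply per-language thresholds; unseen languages use `default` (assumed)."""
    return [int(p >= thresholds.get(lang, default)) for p, lang in zip(probs, langs)]


if __name__ == "__main__":
    # Toy dev data: three "seeds", two languages.
    seed_probs = [
        [0.9, 0.8, 0.3, 0.1, 0.4, 0.2],
        [0.9, 0.8, 0.3, 0.1, 0.4, 0.2],
        [0.9, 0.8, 0.3, 0.1, 0.4, 0.2],
    ]
    labels = [1, 1, 0, 0, 1, 0]
    langs = ["en", "en", "en", "en", "de", "de"]
    avg = average_probs(seed_probs)
    th = tune_thresholds(avg, labels, langs)
    print(th, predict(avg, langs, th))
```

One design point the sketch makes visible: a threshold table keyed by language has no entry for a language that appears only at test time, so some fallback is required. How the submitted system handled Farsi is not stated in the abstract; the 0.5 default here is purely a placeholder.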

Anthology ID: 2026.smm4h-1.2
Volume: Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Month: July
Year: 2026
Address: San Diego, United States
Editors: Guillermo Lopez-Garcia, Graciela Gonzalez-Hernandez
Venues: SMM4H | WS
Publisher: Association for Computational Linguistics
Pages: 7–11
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.smm4h-1.2/
Cite (ACL): Ankit Kumar Singh. 2026. Gladiators at #SMM4H–HeaRD 2026: Multi-Seed XLM-RoBERTa Ensemble with Focal Loss and Per-Language Threshold Optimization for Multilingual Adverse Drug Event Detection. In Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks, pages 7–11, San Diego, United States. Association for Computational Linguistics.
Cite (Informal): Gladiators at #SMM4H–HeaRD 2026: Multi-Seed XLM-RoBERTa Ensemble with Focal Loss and Per-Language Threshold Optimization for Multilingual Adverse Drug Event Detection (Singh, SMM4H 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.smm4h-1.2.pdf
