Team TIET at #SMM4H-HeaRD 2026: Fine-tuned Biomedical Transformers with Language-Balanced Sampling for Patient Metadata and Multilingual ADE Detection

Divrose Kaur, Jatin Bedi, Jasmeet Singh


Abstract
We present Team TIET’s systems for two shared tasks at #SMM4H-HeaRD 2026: Task 5 (detection of patient metadata in SARS-CoV-2 sequencing papers) and Task 1 (multilingual adverse drug event detection across six languages plus an unseen Farsi subset). For Task 5, we explore iterative LLM prompting and then fine-tune BiomedBERT-base with weighted cross-entropy loss and probability threshold optimization, achieving F1 = 0.760 on the official test set (above the competition mean of 0.729). For Task 1, we fine-tune XLM-RoBERTa-base with a combined language- and class-balanced sampling strategy and per-language threshold tuning, achieving macro F1 = 0.497 overall (0.608 excluding the unseen Farsi subset). We report three empirical findings: BERT+LLM ensembling fails when the predicted probability distributions are bimodal, base model variants outperform large variants under limited data, and language-balanced gradient contribution matters in multilingual classification.
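
The abstract names two techniques for Task 1 without giving their details: combined language- and class-balanced sampling, and per-language threshold tuning. The following is a minimal sketch of one way these could be realized, not the authors' implementation. It assumes that "combined" balancing means equal total sampling mass for every (language, label) group, and that thresholds are chosen by a grid search maximizing F1 on held-out predictions. All function names, the grid range, and the toy data layout are hypothetical.

```python
from collections import Counter


def balanced_sample_weights(languages, labels):
    """Per-example sampling weights (assumed scheme, not from the paper).

    Each (language, label) group receives the same total weight, so a rare
    group such as positive examples in a small language is drawn as often
    as a frequent one. The weights sum to 1.
    """
    counts = Counter(zip(languages, labels))
    n_groups = len(counts)
    return [1.0 / (n_groups * counts[(lang, y)])
            for lang, y in zip(languages, labels)]


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, probs, grid=None):
    """Grid-search the probability cutoff that maximizes F1.

    The grid 0.05..0.95 in steps of 0.01 is an arbitrary illustrative choice.
    Ties are resolved in favor of the lowest threshold.
    """
    if grid is None:
        grid = [i / 100 for i in range(5, 96)]
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        preds = [1 if p >= t else 0 for p in probs]
        score = f1_score(y_true, preds)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t, best_f1


def per_language_thresholds(languages, y_true, probs):
    """Tune one threshold per language on held-out predictions."""
    by_lang = {}
    for lang, y, p in zip(languages, y_true, probs):
        ys, ps = by_lang.setdefault(lang, ([], []))
        ys.append(y)
        ps.append(p)
    return {lang: best_threshold(ys, ps)[0]
            for lang, (ys, ps) in by_lang.items()}
```

In a PyTorch training loop, weights of this kind could be passed to `torch.utils.data.WeightedRandomSampler`. A language absent from the tuning data, such as the unseen Farsi subset, would have no tuned threshold under this sketch and would need a fallback value.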
Anthology ID:
2026.smm4h-1.44
Volume:
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Month:
July
Year:
2026
Address:
San Diego, United States
Editors:
Guillermo Lopez-Garcia, Graciela Gonzalez-Hernandez
Venues:
SMM4H | WS
Publisher:
Association for Computational Linguistics
Pages:
268–271
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.smm4h-1.44/
Cite (ACL):
Divrose Kaur, Jatin Bedi, and Jasmeet Singh. 2026. Team TIET at #SMM4H-HeaRD 2026: Fine-tuned Biomedical Transformers with Language-Balanced Sampling for Patient Metadata and Multilingual ADE Detection. In Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks, pages 268–271, San Diego, United States. Association for Computational Linguistics.
Cite (Informal):
Team TIET at #SMM4H-HeaRD 2026: Fine-tuned Biomedical Transformers with Language-Balanced Sampling for Patient Metadata and Multilingual ADE Detection (Kaur et al., SMM4H 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.smm4h-1.44.pdf