Alicia Ramirez-Arrabe
2026
LSI_UNED at #SMM4H–HeaRD 2026: Grid-Based Biomedical Named Entity Recognition Across Languages and Entity Types
Alicia Ramirez-Arrabe | Juan Martinez-Romo | Andres Duque
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
This paper describes the participation of the LSI_UNED team in the first sub-task of MultiClinAI at the #SMM4H-HeaRD 2026 Workshop, which focuses on multilingual clinical named entity recognition in seven languages. The task requires identifying mentions of diseases, procedures, and symptoms in clinical case reports. We propose a set of systems based on the W2NER architecture, with a separate model trained for each language and entity type. For Spanish, we use a RoBERTa-based model with data augmentation from additional NER resources, while the English and Italian systems are based on different biomedical BERT variants. Results show consistent performance across languages, with the best overall results obtained for Spanish. Data augmentation improves recall and F1, while the English and Italian models achieve competitive but slightly lower scores. Symptom recognition remains the most challenging entity type across all languages.
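To illustrate the grid-based formulation named in the title, the following is a minimal sketch of the word-word labeling scheme generally associated with W2NER: consecutive words of an entity are linked by a Next-Neighboring-Word (NNW) label in the upper triangle of the grid, and the entity's last word points back to its first word with a Tail-Head-Word (THW) label carrying the entity type. This is not the authors' implementation. The function names, label ids, and the example sentence are assumptions made for illustration, and the sketch covers only grid construction and decoding, not the neural model that predicts the grid.

```python
from typing import Dict, List, Tuple

# Hypothetical label ids: 0 = no relation, 1 = NNW, 2+ = THW-<type>.
NONE, NNW = 0, 1
Entity = Tuple[List[int], str]  # (word indices in order, entity type)


def build_grid(n_words: int, entities: List[Entity],
               type_ids: Dict[str, int]) -> List[List[int]]:
    """Encode entities as an n_words x n_words word-word relation grid."""
    grid = [[NONE] * n_words for _ in range(n_words)]
    for word_idxs, etype in entities:
        # Link each word to the next word of the same entity (upper triangle).
        for a, b in zip(word_idxs, word_idxs[1:]):
            grid[a][b] = NNW
        # Mark the boundary and type: tail row, head column (lower triangle
        # or diagonal for single-word entities).
        grid[word_idxs[-1]][word_idxs[0]] = type_ids[etype]
    return grid


def decode_grid(grid: List[List[int]],
                id_to_type: Dict[int, str]) -> List[Tuple[Tuple[int, ...], str]]:
    """Recover entities by following NNW links from each head to its tail."""
    n = len(grid)
    found = set()

    def walk(path: List[int], tail: int, etype: str) -> None:
        cur = path[-1]
        if cur == tail:
            found.add((tuple(path), etype))
            return
        for nxt in range(cur + 1, tail + 1):
            if grid[cur][nxt] == NNW:
                walk(path + [nxt], tail, etype)

    for tail in range(n):
        for head in range(tail + 1):
            label = grid[tail][head]
            if label >= 2:  # a THW-<type> cell
                walk([head], tail, id_to_type[label])
    return sorted(found)


if __name__ == "__main__":
    words = ["fever", "and", "severe", "chest", "pain"]
    type_ids = {"DISEASE": 2, "PROCEDURE": 3, "SYMPTOM": 4}
    gold = [([0], "SYMPTOM"), ([2, 3, 4], "SYMPTOM")]
    grid = build_grid(len(words), gold, type_ids)
    id_to_type = {v: k for k, v in type_ids.items()}
    for idxs, etype in decode_grid(grid, id_to_type):
        print(etype, " ".join(words[i] for i in idxs))
```

With a separate model per language and entity type, as the abstract describes, each grid would need only a single THW label; the three-type mapping above is kept only to show how the type is carried by the tail-head cell.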