Francisco J. Veredas
2026
ICB-UMA at #SMM4H–HeaRD 2026: Hybrid Clinical Entity Projection for MultiClinAI: Adaptive Candidate Windows, XGBoost, and LLM Refinement
Alvaro Rey-Blanes | Sara Giménez-Gómez | Francisco J. Veredas | Francisco J. Moreno-Barea
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
This paper presents our submission to the MultiClinAI Shared Task (Gallego-Donoso et al., 2026) on cross-lingual projection of clinical entity annotations from Spanish to English. Our system transfers expert annotations for the Diseases, Symptoms, and Procedures entity types. The approach integrates three core components: adaptive candidate window generation, an XGBoost classifier that uses surface and semantic features, and an LLM-based post-processing stage that resolves complex misalignments. Our highest-performing run ranked 3rd on the official leaderboard, achieving strict F1 scores of 0.737, 0.549, and 0.538 for Diseases, Symptoms, and Procedures, respectively. These results show that combining supervised candidate scoring with targeted LLM refinement provides a robust strategy for clinical entity projection.
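To make the candidate-scoring idea concrete, the following is a minimal sketch and not the system described in the paper. It assumes that "adaptive" means the window length in the target sentence is bounded around the token length of the source mention; the paper's actual windowing rule is not stated here. A single character-overlap score stands in for the trained XGBoost classifier. All function names and the `slack` parameter are hypothetical, and the LLM refinement stage is not shown.

```python
from difflib import SequenceMatcher


def candidate_windows(tokens, source_len, slack=2):
    """Yield (start, end) token spans whose length is within `slack`
    tokens of the source mention length (assumed adaptation rule)."""
    min_len = max(1, source_len - slack)
    max_len = min(len(tokens), source_len + slack)
    for start in range(len(tokens)):
        for length in range(min_len, max_len + 1):
            end = start + length
            if end > len(tokens):
                break  # longer windows from this start would also overflow
            yield start, end


def surface_score(source_mention, candidate_text):
    """Character-level similarity in [0, 1]; a stand-in for one surface
    feature that a supervised classifier could consume."""
    return SequenceMatcher(
        None, source_mention.lower(), candidate_text.lower()
    ).ratio()


def project_entity(source_mention, target_tokens, slack=2):
    """Return the best-scoring target span and its score.
    In a full pipeline the score would come from a trained model
    over several surface and semantic features."""
    source_len = len(source_mention.split())
    best_span, best_score = None, -1.0
    for start, end in candidate_windows(target_tokens, source_len, slack):
        candidate = " ".join(target_tokens[start:end])
        score = surface_score(source_mention, candidate)
        if score > best_score:
            best_span, best_score = (start, end), score
    return best_span, best_score


if __name__ == "__main__":
    # Toy example: Spanish mention "asma" projected onto an English sentence.
    span, score = project_entity("asma", ["patient", "with", "asthma"], slack=0)
    print(span, round(score, 2))
```

Bounding window length by the source mention keeps the candidate set small. Surface similarity alone would fail on non-cognate translations, which is presumably why the described system also uses semantic features and an LLM refinement stage.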