Guillermo Lopez-Garcia
2026
Overview of the 11th Social Media Mining for Health (#SMM4H) and Health Real-World Data (HeaRD) Shared Tasks at ACL 2026
Guillermo Lopez-Garcia | Jose Miguel Acitores Cortina | Jacob Berkowitz | Joey Chan | Sumon Kanti Dey | Ivan Flores Amaro | Fernando Gallego | Lauren Gryboski | Ari Z. Klein | Farnoush Zeidi Kolehparcheh | Martin Krallinger | Salvador Lima-Lopez | Yujun Ma | Tomohiro Nishiyama | Ahmad Rezaie Mianroodi | Amirali Rezaie Mianroodi | Lisa Raithel | Roland Roller | Judith Rosell | Frank Rudzicz | Abeed Sarker | Nicholas Tatonetti | Philippe Thomas | Elena Tutubalina | Dongfang Xu | Farnaz Zeidi | Yu Zhai | Pierre Zweigenbaum | Graciela Gonzalez-Hernandez
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
The aim of the Social Media Mining for Health Applications and Health Real-World Data (#SMM4H-HeaRD) shared tasks is to foster the development and evaluation of natural language processing, machine learning, and artificial intelligence methods for analyzing health-related text from social media and other real-world data sources. For the 11th iteration, held online and co-located with ACL 2026, the workshop continued the expanded #SMM4H-HeaRD platform initiated in 2025, broadening its scope beyond social media to include additional health real-world data sources such as clinical narratives and biomedical literature. The 8 shared tasks covered diverse data sources, health domains (e.g., adverse drug events, insomnia, influenza vaccine effectiveness, cancer staging, substance use), and task formulations (e.g., classification, named entity recognition, span extraction, and text generation). In total, 110 teams registered, representing 31 countries. In this paper, we present an overview of the datasets, participant systems, and performance results, providing insights into current methods for mining social media and health real-world data for biomedical and clinical applications.
Overview of #SMM4H-HeaRD 2026 - Task 2: Detection of Insomnia in Clinical Notes
Joey Chan | Lauren D. Gryboski | Guillermo Lopez-Garcia | Graciela Gonzalez-Hernandez
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
This paper provides an overview of Task 2 from the Social Media Mining for Health and Health Real-World Data (#SMM4H-HeaRD) 2026 Workshop and Shared Tasks, which focused on the detection of insomnia in clinical notes derived from the MIMIC-III dataset. The task consisted of two subtasks: binary text classification to determine whether a patient is likely experiencing insomnia (Subtask 1), and multi-label classification combined with character-level evidence extraction to identify supporting evidence for specific insomnia criteria (Subtask 2). Eight teams participated, using approaches ranging from large language model (LLM) prompting and fine-tuned encoder models to hybrid rule-based pipelines. Results demonstrated that structured LLM pipelines with deterministic post-processing achieved the strongest overall performance, while character-level span extraction remained substantially harder than classification across all systems. These findings highlight both the promise of NLP for identifying underdiagnosed conditions in electronic health records and the ongoing difficulty of producing interpretable, evidence-grounded clinical predictions.
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Guillermo Lopez-Garcia | Graciela Gonzalez-Hernandez
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
Co-authors
- Graciela Gonzalez-Hernandez 3
- Joey Chan 2
- Jacob Berkowitz 1
- Jose Cortina 1
- Sumon Kanti Dey 1
- Ivan Flores Amaro 1
- Fernando Gallego 1
- Lauren Gryboski 1
- Lauren D. Gryboski 1
- Ari Z. Klein 1
- Martin Krallinger 1
- Salvador Lima-Lopez 1
- Yujun Ma 1
- Tomohiro Nishiyama 1
- Lisa Raithel 1
- Ahmad Rezaie Mianroodi 1
- Amirali Rezaie Mianroodi 1
- Roland Roller 1
- Judith Rosell 1
- Frank Rudzicz 1
- Abeed Sarker 1
- Nicholas Tatonetti 1
- Philippe Thomas 1
- Elena Tutubalina 1
- Dongfang Xu 1
- Farnaz Zeidi 1
- Farnoush Zeidi Kolehparcheh 1
- Yu Zhai 1
- Pierre Zweigenbaum 1