Gijs Danoe
2026
Dr-BERT-NL at #SMM4H–HeaRD 2026: DOKTERBERT – Ontology-Grounded Contextual Representations for Dutch Clinical NLP
Gijs Danoe | Andreas Voss | Axel Hamprecht | Matthijs S. Berends
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks
We describe our submission to SMM4H-HeaRD 2026 Task 7, which asks systems to label ClinicalImpacts and SocialImpacts spans in Reddit posts about non-medical substance use. We compare four pipeline shapes built on the same DeBERTa-v3-base backbone: (i) a direct 5-class encoder with a linear-chain CRF head, (ii) a two-stage detect-then-classify pipeline that delegates span typing to an instruction-tuned LLM (Qwen2.5-7B or Gemma-3-12B, 4-bit NF4), (iii) an audit pipeline in which the same LLM verifies the encoder’s predictions, and (iv) a classical-ML variant that replaces the LLM with an SVM trained on encoder span embeddings. Across 16 configurations, the encoder-only DeBERTa-v3 + CRF configuration is the strongest single system on the official test split, reaching 45.4% strict and 54.2% relaxed F1, +8.6 / +5.3 points above a mental-roberta-base baseline. LLM audits give a small dev gain that does not transfer to test.
2022
RUG-1-Pegasussers at SemEval-2022 Task 3: Data Generation Methods to Improve Recognizing Appropriate Taxonomic Word Relations
Frank van den Berg | Gijs Danoe | Esther Ploeger | Wessel Poelman | Lukas Edman | Tommaso Caselli
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
This paper describes our system created for the SemEval 2022 Task 3: Presupposed Taxonomies - Evaluating Neural-network Semantics. This task is focused on correctly recognizing taxonomic word relations in English, French and Italian. We developed various data generation techniques that expand the originally provided train set and show that all methods increase the performance of models trained on these expanded datasets. Our final system outperformed the baseline system from the task organizers by achieving an average macro F1 score of 79.6 on all languages, compared to the baseline’s 67.4.
2021
Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
Firoj Alam | Shaden Shaar | Fahim Dalvi | Hassan Sajjad | Alex Nikolov | Hamdy Mubarak | Giovanni Da San Martino | Ahmed Abdelali | Nadir Durrani | Kareem Darwish | Abdulaziz Al-Homaid | Wajdi Zaghouani | Tommaso Caselli | Gijs Danoe | Friso Stolk | Britt Bruntink | Preslav Nakov
Findings of the Association for Computational Linguistics: EMNLP 2021
With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic. Fighting this infodemic has been declared one of the most important focus areas of the World Health Organization, with dangers ranging from promoting fake cures, rumors, and conspiracy theories to spreading xenophobia and panic. Addressing the issue requires solving a number of challenging problems such as identifying messages containing claims, determining their check-worthiness and factuality, and their potential to do harm as well as the nature of that harm, to mention just a few. To address this gap, we release a large dataset of 16K manually annotated tweets for fine-grained disinformation analysis that (i) focuses on COVID-19, (ii) combines the perspectives and the interests of journalists, fact-checkers, social media platforms, policy makers, and society, and (iii) covers Arabic, Bulgarian, Dutch, and English. Finally, we show strong evaluation results using pretrained Transformers, thus confirming the practical utility of the dataset in monolingual vs. multilingual, and single task vs. multitask settings.