Leveraging Wikidata for Biomedical Entity Linking in a Low-Resource Setting: A Case Study for German

Faizan E Mustafa, Corina Dima, Juan Ochoa, Steffen Staab


Abstract
Biomedical Entity Linking (BEL) is a challenging task for low-resource languages due to the lack of appropriate resources: datasets, knowledge bases (KBs), and pre-trained models. In this paper, we propose an approach to create a biomedical knowledge base for German BEL using UMLS information from Wikidata, which provides good coverage and can be easily extended to further languages. As a further contribution, we adapt several existing approaches for use in the German BEL setup and report their results. The chosen methods include a sparse model using character n-grams, a multilingual biomedical entity linker, and two general-purpose text retrieval models. Our results show that a language-specific KB with good coverage yields the largest improvement in entity linking performance, irrespective of the model used. The fine-tuned German BEL model, the newly created UMLSWikidata KB, and the code to reproduce our results are publicly available.
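As an illustration of the sparse character n-gram approach mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a TF-IDF weighting with cosine similarity, which the abstract does not specify. The class name `CharNgramLinker`, the n-gram range, and the toy KB with placeholder concept IDs are all hypothetical and chosen only for demonstration.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


class CharNgramLinker:
    """Toy sparse linker: TF-IDF over character n-grams, cosine similarity."""

    def __init__(self, kb):
        # kb is a list of (concept_id, name) pairs; the IDs are placeholders.
        self.entries = list(kb)
        docs = [Counter(char_ngrams(name)) for _, name in self.entries]
        # Document frequency: number of KB names containing each n-gram.
        df = Counter(gram for doc in docs for gram in doc)
        n_docs = len(docs)
        # Smoothed inverse document frequency.
        self.idf = {g: math.log((1 + n_docs) / (1 + c)) + 1.0 for g, c in df.items()}
        self.vectors = [self._vectorize(doc) for doc in docs]

    def _vectorize(self, counts):
        # Weight known n-grams by TF-IDF and L2-normalize the result.
        weights = {g: tf * self.idf[g] for g, tf in counts.items() if g in self.idf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        return {g: w / norm for g, w in weights.items()} if norm else {}

    def link(self, mention, top_k=1):
        """Return the top_k (score, concept_id, name) candidates for a mention."""
        query = self._vectorize(Counter(char_ngrams(mention)))
        scored = []
        for (concept_id, name), vec in zip(self.entries, self.vectors):
            score = sum(w * vec.get(g, 0.0) for g, w in query.items())
            scored.append((score, concept_id, name))
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


# Hypothetical German KB entries with placeholder IDs (not real UMLS CUIs).
kb = [
    ("C_DIABETES", "Diabetes mellitus"),
    ("C_HYPERTONIE", "Bluthochdruck"),
    ("C_ASTHMA", "Asthma bronchiale"),
]
linker = CharNgramLinker(kb)
best = linker.link("Diabetis mellitus")[0]
print(best[1], best[2])
```

Because matching happens at the character level, a misspelled or inflected mention can still share most of its n-grams with the correct KB name, which is one reason such models are a common baseline when little training data is available.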
Anthology ID:
2024.clinicalnlp-1.17
Volume:
Proceedings of the 6th Clinical Natural Language Processing Workshop
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Tristan Naumann, Asma Ben Abacha, Steven Bethard, Kirk Roberts, Danielle Bitterman
Venues:
ClinicalNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
202–207
URL:
https://aclanthology.org/2024.clinicalnlp-1.17
Cite (ACL):
Faizan E Mustafa, Corina Dima, Juan Ochoa, and Steffen Staab. 2024. Leveraging Wikidata for Biomedical Entity Linking in a Low-Resource Setting: A Case Study for German. In Proceedings of the 6th Clinical Natural Language Processing Workshop, pages 202–207, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
Leveraging Wikidata for Biomedical Entity Linking in a Low-Resource Setting: A Case Study for German (Mustafa et al., ClinicalNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.clinicalnlp-1.17.pdf