Biomedical Named Entity Recognition with Multilingual BERT

Kai Hakala, Sampo Pyysalo


Abstract
We present the approach of the Turku NLP group to the PharmaCoNER task on Spanish biomedical named entity recognition. We apply a CRF-based baseline and multilingual BERT to the task; with BERT we achieve an F-score of 88% on the development data and 87% on the test set. Our approach is a straightforward application of a state-of-the-art multilingual model that is not specifically tailored to either the language or the application domain. The source code is available at: https://github.com/chaanim/pharmaconer
Anthology ID:
D19-5709
Volume:
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Jin-Dong Kim, Claire Nédellec, Robert Bossy, Louise Deléger
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
56–61
URL:
https://aclanthology.org/D19-5709
DOI:
10.18653/v1/D19-5709
Cite (ACL):
Kai Hakala and Sampo Pyysalo. 2019. Biomedical Named Entity Recognition with Multilingual BERT. In Proceedings of the 5th Workshop on BioNLP Open Shared Tasks, pages 56–61, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Biomedical Named Entity Recognition with Multilingual BERT (Hakala & Pyysalo, BioNLP 2019)
PDF:
https://preview.aclanthology.org/landing_page/D19-5709.pdf
Code:
https://github.com/chaanim/pharmaconer