Transfer Learning in Biomedical Named Entity Recognition: An Evaluation of BERT in the PharmaCoNER task

Cong Sun, Zhihao Yang


Abstract
To date, a large amount of biomedical content, especially clinical documents, has been published in languages other than English. It is therefore of considerable significance to conduct Natural Language Processing (NLP) research on non-English literature. PharmaCoNER is the first Named Entity Recognition (NER) task to recognize chemical and protein entities in Spanish biomedical texts. Since abundant resources already exist in the NLP field, how to exploit these resources for a new task and obtain competitive performance is a meaningful question. Inspired by the success of transfer learning with language models, we introduce a BERT benchmark to facilitate research on the PharmaCoNER task. In this paper, we evaluate two baselines, based on Multilingual BERT and BioBERT, on the PharmaCoNER corpus. Experimental results show that transferring knowledge learned from large-scale source datasets to the target domain offers an effective solution for the PharmaCoNER task.
Anthology ID:
D19-5715
Volume:
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
Month:
November
Year:
2019
Address:
Hong Kong, China
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
100–104
URL:
https://aclanthology.org/D19-5715
DOI:
10.18653/v1/D19-5715
Cite (ACL):
Cong Sun and Zhihao Yang. 2019. Transfer Learning in Biomedical Named Entity Recognition: An Evaluation of BERT in the PharmaCoNER task. In Proceedings of the 5th Workshop on BioNLP Open Shared Tasks, pages 100–104, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Transfer Learning in Biomedical Named Entity Recognition: An Evaluation of BERT in the PharmaCoNER task (Sun & Yang, BioNLP 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D19-5715.pdf