Abstract
To date, a large amount of biomedical content has been published in languages other than English, especially clinical documents. It is therefore important to conduct Natural Language Processing (NLP) research on non-English literature. PharmaCoNER is the first Named Entity Recognition (NER) task to recognize chemical and protein entities in Spanish biomedical texts. Because abundant resources already exist in the NLP field, how to exploit these existing resources on a new task to obtain competitive performance is a meaningful question. Inspired by the success of transfer learning with language models, we introduce a BERT benchmark to facilitate research on the PharmaCoNER task. In this paper, we evaluate two baselines based on Multilingual BERT and BioBERT on the PharmaCoNER corpus. Experimental results show that transferring the knowledge learned from large-scale source datasets to the target domain offers an effective solution for the PharmaCoNER task.
- Anthology ID:
- D19-5715
- Volume:
- Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Venue:
- BioNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 100–104
- URL:
- https://aclanthology.org/D19-5715
- DOI:
- 10.18653/v1/D19-5715
- Cite (ACL):
- Cong Sun and Zhihao Yang. 2019. Transfer Learning in Biomedical Named Entity Recognition: An Evaluation of BERT in the PharmaCoNER task. In Proceedings of the 5th Workshop on BioNLP Open Shared Tasks, pages 100–104, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Transfer Learning in Biomedical Named Entity Recognition: An Evaluation of BERT in the PharmaCoNER task (Sun & Yang, BioNLP 2019)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/D19-5715.pdf