Abstract
We propose DISTANT (DIstantly Supervised enTity spAN deTection and classification), a distantly supervised NER pipeline that performs entity span detection and entity classification in sequence. The entity span detector first extracts candidate entity mention spans using distant supervision. The entity classifier then assigns each span to one of the positive entity types or to none, using a positive and unlabeled (PU) learning framework. Both models are built on the pre-trained SciBERT model and fine-tuned on the silver corpus generated by distant supervision. Experimental results on the BC5CDR and NCBI-Disease datasets show that our method outperforms end-to-end NER baselines without PU learning by a large margin. It is particularly effective at increasing recall.
- Anthology ID:
- 2023.bionlp-1.14
- Volume:
- The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Dina Demner-Fushman, Sophia Ananiadou, Kevin Cohen
- Venue:
- BioNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 171–177
- URL:
- https://aclanthology.org/2023.bionlp-1.14
- DOI:
- 10.18653/v1/2023.bionlp-1.14
- Cite (ACL):
- Ken Yano, Makoto Miwa, and Sophia Ananiadou. 2023. DISTANT: Distantly Supervised Entity Span Detection and Classification. In The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, pages 171–177, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- DISTANT: Distantly Supervised Entity Span Detection and Classification (Yano et al., BioNLP 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.bionlp-1.14.pdf
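The two stages named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation. The abstract does not say how the silver corpus is generated or which PU objective is used, so the sketch assumes longest-match dictionary lookup for distant supervision and the non-negative PU risk estimator with a sigmoid loss for the classifier. The names `silver_spans`, `sigmoid_loss`, and `nn_pu_risk`, the toy dictionary, and the class prior value are all hypothetical.

```python
import math


def silver_spans(tokens, dictionary):
    """Label spans by longest dictionary match (assumed form of distant supervision).

    `dictionary` maps tuples of lowercase tokens to an entity type.
    Returns (start, end, type) triples with `end` exclusive.
    """
    spans = []
    max_len = max((len(key) for key in dictionary), default=0)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate first so multi-token names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in dictionary:
                spans.append((i, i + n, dictionary[key]))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return spans


def sigmoid_loss(score, label):
    """Sigmoid loss for a real-valued score and a label in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk (assumed objective, not confirmed by the abstract).

    pos_scores: classifier scores for spans labeled positive by the dictionary.
    unl_scores: scores for unlabeled spans, which may hide true entities.
    prior: assumed fraction of positives among the unlabeled spans.
    """
    def mean(values):
        return sum(values) / len(values)

    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # Clamp the estimated negative risk at zero so it cannot go negative.
    negative_risk = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos_as_pos + negative_risk


if __name__ == "__main__":
    toy_dictionary = {("aspirin",): "Chemical", ("stomach", "ulcer"): "Disease"}
    sentence = ["Aspirin", "may", "cause", "stomach", "ulcer"]
    print(silver_spans(sentence, toy_dictionary))
    print(nn_pu_risk([0.0], [0.0], prior=0.5))
```

Treating unmatched spans as unlabeled rather than negative is the point of the PU setup: spans the dictionary misses are not penalized as confident negatives, which is consistent with the recall gain the abstract reports.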