MIGBaseline at ROCLING 2022 Shared Task: Report on Named Entity Recognition Using Chinese Healthcare Datasets

Hsing-Yuan Ma, Wei-Jie Li, Chao-Lin Liu


Abstract
Named Entity Recognition (NER) tools have been in development for years, yet few have targeted medical documents. The increasing need to analyze medical data makes it crucial to build a sophisticated NER model for this underserved domain. In this paper, W2NER, the state-of-the-art NER model, which has excelled in English and Chinese tasks, is evaluated with selected inputs, several pretrained language models, and several training strategies. The objective was to build an NER model suitable for Chinese healthcare corpora. The best model achieved an F1 score of 81.93%, which ranked first in the ROCLING 2022 shared task.
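For readers unfamiliar with how an NER F1 score such as the one reported above is typically obtained, the following is a minimal sketch of entity-level (exact span match) precision, recall, and F1 over BIO-tagged sequences. It is an illustration only: the abstract does not describe the shared task's scoring script, so exact-match micro-averaging is an assumption, and the function names `bio_to_spans` and `span_f1` and the labels `BODY` and `SYMP` in the example are hypothetical.

```python
from typing import List, Set, Tuple

# A span is (start index, exclusive end index, entity type).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence.

    Assumption: an I- tag that does not continue the current entity
    is treated as the start of a new entity.
    """
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag == "O":
            start, label = None, None
        else:
            # B-X, or a stray I-X, opens a new entity of type X.
            start, label = i, tag[2:]
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 with exact span matching."""
    tp = n_gold = n_pred = 0
    for g_tags, p_tags in zip(gold, pred):
        g_spans, p_spans = bio_to_spans(g_tags), bio_to_spans(p_tags)
        tp += len(g_spans & p_spans)
        n_gold += len(g_spans)
        n_pred += len(p_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the second predicted entity is one token too short,
# so only one of the two gold entities is matched exactly.
gold_tags = [["B-BODY", "I-BODY", "O", "B-SYMP", "I-SYMP"]]
pred_tags = [["B-BODY", "I-BODY", "O", "B-SYMP", "O"]]
precision, recall, f1 = span_f1(gold_tags, pred_tags)
```

Under exact matching, a prediction with a correct type but a wrong boundary counts as both a false positive and a false negative, which is why boundary errors are penalized heavily in this style of evaluation.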
Anthology ID:
2022.rocling-1.45
Volume:
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)
Month:
November
Year:
2022
Address:
Taipei, Taiwan
Editors:
Yung-Chun Chang, Yi-Chin Huang
Venue:
ROCLING
Publisher:
The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages:
356–362
Language:
Chinese
URL:
https://aclanthology.org/2022.rocling-1.45
Cite (ACL):
Hsing-Yuan Ma, Wei-Jie Li, and Chao-Lin Liu. 2022. MIGBaseline at ROCLING 2022 Shared Task: Report on Named Entity Recognition Using Chinese Healthcare Datasets. In Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022), pages 356–362, Taipei, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Cite (Informal):
MIGBaseline at ROCLING 2022 Shared Task: Report on Named Entity Recognition Using Chinese Healthcare Datasets (Ma et al., ROCLING 2022)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2022.rocling-1.45.pdf