面向医学文本处理的医学实体标注规范(Medical Entity Annotation Standard for Medical Text Processing)
Huan Zhang (张欢), Yuan Zong (宗源), Baobao Chang (常宝宝), Zhifang Sui (穗志方), Hongying Zan (昝红英), Kunli Zhang (张坤丽)
Abstract
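As smart healthcare becomes widespread, demand for using natural language processing to identify medical information keeps growing. At present there is no shared corpus for medical entities, which has severely hindered progress across medical text processing tasks. How should different categories of medical entity be distinguished? How should the scope covered by each entity be delimited? These open questions explain why medical text lacks the large-scale, consistently annotated data available for general-domain text. To address this, the paper draws on the semantic types defined in UMLS and proposes a medical entity annotation standard for medical text processing. The standard covers 9 types of medical entity, including diseases, clinical manifestations, medical procedures, and medical equipment, and the authors build a medical entity annotation corpus on top of it. The paper describes the standard's description system, classification principles, handling of confusable cases, and corpus annotation process, together with baseline experiments on automatic medical entity annotation. The authors hope it can serve as a reference annotation standard for building medical entity corpora and provide corpus support for medical entity recognition.
- Anthology ID: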
- 2020.ccl-1.52
- Volume:
- Proceedings of the 19th Chinese National Conference on Computational Linguistics
- Month:
- October
- Year:
- 2020
- Address:
- Haikou, China
- Editors:
- Maosong Sun (孙茂松), Sujian Li (李素建), Yue Zhang (张岳), Yang Liu (刘洋)
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 561–571
- Language:
- Chinese
- URL:
- https://aclanthology.org/2020.ccl-1.52
- Cite (ACL):
- Huan Zhang, Yuan Zong, Baobao Chang, Zhifang Sui, Hongying Zan, and Kunli Zhang. 2020. 面向医学文本处理的医学实体标注规范(Medical Entity Annotation Standard for Medical Text Processing). In Proceedings of the 19th Chinese National Conference on Computational Linguistics, pages 561–571, Haikou, China. Chinese Information Processing Society of China.
- Cite (Informal):
- 面向医学文本处理的医学实体标注规范(Medical Entity Annotation Standard for Medical Text Processing) (Zhang et al., CCL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.ccl-1.52.pdf