Hongping Shu


一种非结构化数据表征增强的术后风险预测模型(An Unstructured Data Representation Enhanced Model for Postoperative Risk Prediction)
Yaqiang Wang (王亚强) | Xiao Yang (杨潇) | Xuechao Hao (郝学超) | Hongping Shu (舒红平) | Guo Chen (陈果) | Tao Zhu (朱涛)
Proceedings of the 21st Chinese National Conference on Computational Linguistics


基于批数据过采样的中医临床记录四诊描述抽取方法(Four Diagnostic Description Extraction in Clinical Records of Traditional Chinese Medicine with Batch Data Oversampling)
Yaqiang Wang (王亚强) | Kailun Li (李凯伦) | Yongguang Jiang (蒋永光) | Hongping Shu (舒红平)
Proceedings of the 21st Chinese National Conference on Computational Linguistics



On Learning Better Embeddings from Chinese Clinical Records: Study on Combining In-Domain and Out-Domain Data
Yaqiang Wang | Yunhui Chen | Hongping Shu | Yongguang Jiang
Proceedings of the BioNLP 2018 workshop

High-quality word embeddings are of great significance for advancing applications of biomedical natural language processing. In recent years, interest in how to learn good embeddings and evaluate embedding quality on English medical text has surged; however, only a limited number of studies have been performed on Chinese medical text, particularly Chinese clinical records. Herein, we propose a novel approach to improving the quality of learned embeddings by using out-domain data as a supplement when Chinese clinical records are limited. Moreover, embedding quality is evaluated based on the Medical Conceptual Similarity Property. The experimental results reveal that selecting good training samples is necessary, and that collecting the right amount of out-domain data and trading off embedding quality against training time are essential factors for better embeddings.