Comparing the Intrinsic Performance of Clinical Concept Embeddings by Their Field of Medicine

John-Jose Nunez, Giuseppe Carenini


Abstract
Pre-trained word embeddings are becoming increasingly popular for natural language processing tasks. This includes medical applications, where embeddings are trained for clinical concepts using specific medical data. Recent work continues to improve on these embeddings. However, no one has yet sought to determine whether these embeddings work as well for one field of medicine as they do for others. In this work, we use intrinsic methods to evaluate embeddings from the various fields of medicine, as defined by their ICD-9 systems. We find significant differences between fields, and motivate future work to investigate whether extrinsic tasks will follow a similar pattern.
Anthology ID:
D19-6202
Volume:
Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019)
Month:
November
Year:
2019
Address:
Hong Kong
Editors:
Eben Holderness, Antonio Jimeno Yepes, Alberto Lavelli, Anne-Lyse Minard, James Pustejovsky, Fabio Rinaldi
Venue:
Louhi
Publisher:
Association for Computational Linguistics
Pages:
11–17
URL:
https://aclanthology.org/D19-6202
DOI:
10.18653/v1/D19-6202
Cite (ACL):
John-Jose Nunez and Giuseppe Carenini. 2019. Comparing the Intrinsic Performance of Clinical Concept Embeddings by Their Field of Medicine. In Proceedings of the Tenth International Workshop on Health Text Mining and Information Analysis (LOUHI 2019), pages 11–17, Hong Kong. Association for Computational Linguistics.
Cite (Informal):
Comparing the Intrinsic Performance of Clinical Concept Embeddings by Their Field of Medicine (Nunez & Carenini, Louhi 2019)
PDF:
https://preview.aclanthology.org/ml4al-ingestion/D19-6202.pdf