Correlation Analysis of Chronic Obstructive Pulmonary Disease (COPD) and its Biomarkers Using the Word Embeddings

Byeong-Hun Yoon, Yu-Seop Kim


Abstract
It is very costly and time consuming to find new biomarkers for specific diseases in clinical laboratories. In this study, to find new biomarkers most closely related to Chronic Obstructive Pulmonary Disease (COPD), which is widely known as respiratory disease, biomarkers known to be associated with respiratory diseases and COPD itself were converted into word embedding. And their similarities were measured. We used Word2Vec, Canonical Correlation Analysis (CCA), and Global Vector (GloVe) for word embedding. In order to replace the clinical evaluation, the titles and abstracts of papers retrieved from Google Scholars were analyzed and quantified to estimate the performance of the word em-bedding models.
Anthology ID:
I17-2057
Volume:
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)
Month:
November
Year:
2017
Address:
Taipei, Taiwan
Editors:
Greg Kondrak, Taro Watanabe
Venue:
IJCNLP
SIG:
Publisher:
Asian Federation of Natural Language Processing
Note:
Pages:
337–342
Language:
URL:
https://aclanthology.org/I17-2057
DOI:
Bibkey:
Cite (ACL):
Byeong-Hun Yoon and Yu-Seop Kim. 2017. Correlation Analysis of Chronic Obstructive Pulmonary Disease (COPD) and its Biomarkers Using the Word Embeddings. In Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pages 337–342, Taipei, Taiwan. Asian Federation of Natural Language Processing.
Cite (Informal):
Correlation Analysis of Chronic Obstructive Pulmonary Disease (COPD) and its Biomarkers Using the Word Embeddings (Yoon & Kim, IJCNLP 2017)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-2/I17-2057.pdf