ICDBigBird: A Contextual Embedding Model for ICD Code Classification
George Michalopoulos, Michal Malyska, Nicola Sahar, Alexander Wong, Helen Chen
Abstract
The International Classification of Diseases (ICD) system is the international standard for classifying diseases and procedures during a healthcare encounter and is widely used for healthcare reporting and management purposes. Assigning correct codes for clinical procedures is important for clinical, operational and financial decision-making in healthcare. Contextual word embedding models have achieved state-of-the-art results in multiple NLP tasks. However, these models have yet to achieve state-of-the-art results in the ICD classification task, because one of their main disadvantages is that they can only process documents containing a small number of tokens, which is rarely the case with real patient notes. In this paper, we introduce ICDBigBird, a BigBird-based model that integrates a Graph Convolutional Network (GCN), which takes advantage of the relations between ICD codes to create ‘enriched’ representations of their embeddings, with a BigBird contextual model that can process larger documents. Our experiments on a real-world clinical dataset demonstrate the effectiveness of our BigBird-based model on the ICD classification task, as it outperforms the previous state-of-the-art models.
- Anthology ID: 2022.bionlp-1.32
- Volume: Proceedings of the 21st Workshop on Biomedical Language Processing
- Month: May
- Year: 2022
- Address: Dublin, Ireland
- Editors: Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
- Venue: BioNLP
- Publisher: Association for Computational Linguistics
- Pages: 330–336
- URL: https://aclanthology.org/2022.bionlp-1.32
- DOI: 10.18653/v1/2022.bionlp-1.32
- Cite (ACL): George Michalopoulos, Michal Malyska, Nicola Sahar, Alexander Wong, and Helen Chen. 2022. ICDBigBird: A Contextual Embedding Model for ICD Code Classification. In Proceedings of the 21st Workshop on Biomedical Language Processing, pages 330–336, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal): ICDBigBird: A Contextual Embedding Model for ICD Code Classification (Michalopoulos et al., BioNLP 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-3/2022.bionlp-1.32.pdf