Ryuki Ida


2023

Biomedical Document Classification with Literature Graph Representations of Bibliographies and Entities
Ryuki Ida | Makoto Miwa | Yutaka Sasaki
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

This paper proposes a new document classification method that incorporates the representations of a literature graph created from bibliographic and entity information. Recently, document classification performance has been significantly improved with large pre-trained language models; however, there still remain documents that are difficult to classify. External information, such as bibliographic information, citation links, descriptions of entities, and medical taxonomies, has been considered one of the keys to dealing with such documents in document classification. Although several document classification methods using external information have been proposed, they only consider limited relationships, e.g., word co-occurrence and citation relationships. However, there are multiple types of external information. To overcome the limitation of the conventional use of external information, we propose a document classification model that simultaneously considers bibliographic and entity information to deeply model the relationships among documents using the representations of the literature graph. The experimental results show that our proposed method outperforms existing methods on two document classification datasets in the biomedical domain with the help of the literature graph.