Yuki Tagawa
This study addresses two key challenges in structuring radiology reports: the lack of a practical structuring schema and the lack of datasets for evaluating model generalizability. To address these challenges, we propose "Finding-Centric Structuring," a schema that organizes reports around individual findings and thereby facilitates secondary use. We also construct JRadFCS, a large-scale dataset with annotated named entities (NEs) and relations, comprising 8,428 Japanese Computed Tomography (CT) reports from seven facilities, providing a comprehensive resource for evaluating model generalizability. Our experiments reveal performance gaps when models trained on single-facility reports are applied to reports from other facilities. We further analyze factors contributing to these gaps and demonstrate that augmenting the training set based on these performance-correlated factors can efficiently enhance model generalizability.
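As a rough illustration only (the class and label names below are assumptions, not taken from the paper), a finding-centric structure could be represented as a set of named entities linked by relations for each individual finding:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entity:
    """A named-entity span in a report, e.g. an anatomical site or an observation."""
    text: str
    label: str   # hypothetical NE label, e.g. "Observation" or "Anatomy"
    start: int   # character offset in the report text
    end: int

@dataclass
class Relation:
    """A directed relation between two entities, e.g. observation -> location."""
    head: Entity
    tail: Entity
    label: str   # hypothetical relation label, e.g. "located_at"

@dataclass
class Finding:
    """One finding-centric unit: the entities and relations describing a single finding."""
    entities: List[Entity] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

# Example sentence: "A 12 mm nodule in the right upper lobe."
nodule = Entity("nodule", "Observation", 8, 14)
site = Entity("right upper lobe", "Anatomy", 22, 38)
finding = Finding(entities=[nodule, site],
                  relations=[Relation(nodule, site, "located_at")])
```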
Automated generation of medical reports that describe the findings in medical images helps radiologists by alleviating their workload. A medical report generation system should produce correct and concise reports. However, data imbalance makes it difficult to train models accurately: medical datasets are commonly imbalanced in their finding labels because incidence rates differ among diseases, and the ratio of abnormal to normal findings is also highly skewed. We propose a novel reinforcement learning method with a reconstructor that improves the clinical correctness of generated reports, allowing the data-to-text module to be trained on a highly imbalanced dataset. Moreover, we introduce a novel data augmentation strategy for reinforcement learning that additionally trains the model on infrequent findings. From the perspective of practical use, we employ a Two-Stage Medical Report Generator (TS-MRGen) for controllable report generation from input images. TS-MRGen consists of two separate stages: an image diagnosis module and a data-to-text module. Radiologists can modify the results of the image diagnosis module to control the reports that the data-to-text module generates. We conduct experiments on two medical datasets to assess the data-to-text module and the entire two-stage model. Results demonstrate that the reports generated by our model describe the findings in the input image more accurately.
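A minimal sketch of the two-stage flow described above (the function names, label set, and stub outputs are placeholders, not the paper's actual interface): an image diagnosis stage predicts finding labels, a radiologist may correct them, and a data-to-text stage renders the final report from the edited labels.

```python
from typing import Dict, List

def image_diagnosis_module(image) -> Dict[str, bool]:
    """Stage 1 (placeholder): predict finding labels from a medical image."""
    # In practice this would be a trained image classifier; here we return a stub.
    return {"cardiomegaly": True, "pleural_effusion": False}

def data_to_text_module(findings: Dict[str, bool]) -> str:
    """Stage 2 (placeholder): generate a report from the (possibly edited) labels."""
    # In practice this would be a trained data-to-text generator fine-tuned with RL.
    sentences: List[str] = []
    for label, present in findings.items():
        name = label.replace("_", " ")
        sentences.append(f"{name.capitalize()} is {'present' if present else 'absent'}.")
    return " ".join(sentences)

# Two-stage, controllable pipeline: the radiologist can edit the intermediate labels
# before report generation, which is what makes the output controllable.
predicted = image_diagnosis_module(image=None)
predicted["pleural_effusion"] = True   # hypothetical manual correction
print(data_to_text_module(predicted))
```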
Knowledge graphs (KGs) are widely used for various NLP tasks. However, because KGs are still incomplete, Knowledge Graph Completion (KGC) methods are needed. Most KGC research does not consider out-of-KG entities (unseen entities), yet a method that can predict relations for entity pairs containing unseen entities is required in order to automatically add new entities to KGs. In this study, we focus on relation prediction and propose a method that learns entity representations over a graph whose nodes are seen entities, unseen entities, and words drawn from the descriptions of all entities. In our experiments, the proposed method shows a significant improvement in relation prediction for entity pairs containing unseen entities.
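As a hedged sketch only (the construction below is an assumption about how such a graph might look, not the paper's exact method), the node set can mix seen entities, unseen entities, and the words of their textual descriptions, with an edge linking each entity to every word in its description; an unseen entity then becomes reachable from seen entities through shared description words.

```python
import networkx as nx

# Toy entity descriptions; "aspirin" is treated as seen, "ibuprofen" as unseen.
descriptions = {
    "aspirin":   "drug used to reduce pain and fever",
    "ibuprofen": "drug used to reduce pain and inflammation",
}
seen = {"aspirin"}

graph = nx.Graph()
for entity, description in descriptions.items():
    graph.add_node(entity, kind="seen" if entity in seen else "unseen")
    for word in description.split():
        graph.add_node(word, kind="word")
        # Edge between an entity and a word appearing in its description.
        graph.add_edge(entity, word)

# The unseen entity is connected to the seen one through shared description words,
# so representations learned over this graph (e.g. with a GNN) can cover both.
shared = set(nx.common_neighbors(graph, "aspirin", "ibuprofen"))
print(shared)  # shared description words, e.g. {'drug', 'used', 'reduce', 'pain', ...}
```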