2025
Automated Structured Radiology Report Generation
Jean-Benoit Delbrouck | Justin Xu | Johannes Moll | Alois Thomas | Zhihong Chen | Sophie Ostmeier | Asfandyar Azhar | Kelvin Zhenghao Li | Andrew Johnston | Christian Bluethgen | Eduardo Pontes Reis | Mohamed S Muneer | Maya Varma | Curtis Langlotz
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Automated radiology report generation from chest X-ray (CXR) images has the potential to improve clinical efficiency and reduce radiologists’ workload. However, most datasets, including the publicly available MIMIC-CXR and CheXpert Plus, consist entirely of free-form reports, which are inherently variable and unstructured. This variability poses challenges for both generation and evaluation: existing models struggle to produce consistent, clinically meaningful reports, and standard evaluation metrics fail to capture the nuances of radiological interpretation. To address this, we introduce Structured Radiology Report Generation (SRRG), a new task that reformulates free-text radiology reports into a standardized format, ensuring clarity, consistency, and structured clinical reporting. We create a novel dataset by restructuring reports using large language models (LLMs) following strict structured reporting desiderata. Additionally, we introduce SRR-BERT, a fine-grained disease classification model trained on 55 labels, enabling more precise and clinically informed evaluation of structured reports. To assess report quality, we propose F1-SRR-BERT, a metric that leverages SRR-BERT’s hierarchical disease taxonomy to bridge the gap between free-text variability and structured clinical reporting. We validate our dataset through a reader study conducted by five board-certified radiologists and extensive benchmarking experiments.
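To make the label-based evaluation concrete, the sketch below scores a generated report against a reference by comparing the disease labels assigned to each, in the spirit of F1-SRR-BERT. This is a minimal illustration only: the label names are invented, the 55-label SRR-BERT taxonomy and its hierarchical mapping are not reproduced, and the paper's exact aggregation may differ.

```python
# Minimal sketch: label-level F1 between a reference and a generated report.
# Label names are illustrative; the actual SRR-BERT taxonomy comprises
# 55 fine-grained disease labels (not reproduced here).

def label_f1(reference_labels: set[str], generated_labels: set[str]) -> float:
    """F1 over two label sets (illustrative, not the official metric)."""
    tp = len(reference_labels & generated_labels)   # labels in both reports
    fp = len(generated_labels - reference_labels)   # hallucinated labels
    fn = len(reference_labels - generated_labels)   # missed labels
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

reference = {"Pleural effusion", "Cardiomegaly"}
generated = {"Pleural effusion", "Pneumothorax"}
print(f"F1 = {label_f1(reference, generated):.2f}")  # F1 = 0.50
```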
2024
Overview of the First Shared Task on Clinical Text Generation: RRG24 and “Discharge Me!”
Justin Xu | Zhihong Chen | Andrew Johnston | Louis Blankemeier | Maya Varma | Jason Hom | William J. Collins | Ankit Modi | Robert Lloyd | Benjamin Hopkins | Curtis Langlotz | Jean-Benoit Delbrouck
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Recent developments in natural language generation have tremendous implications for healthcare. For instance, state-of-the-art systems could automate the generation of sections in clinical reports to alleviate physician workload and streamline hospital documentation. To explore these applications, we present a shared task consisting of two subtasks: (1) Radiology Report Generation (RRG24) and (2) Discharge Summary Generation (“Discharge Me!”). RRG24 involves generating the ‘Findings’ and ‘Impression’ sections of radiology reports given chest X-rays. “Discharge Me!” involves generating the ‘Brief Hospital Course’ and ‘Discharge Instructions’ sections of discharge summaries for patients admitted through the emergency department. “Discharge Me!” submissions were subsequently reviewed by a team of clinicians. Both tasks emphasize the goal of reducing clinician burnout and repetitive workloads by generating documentation. We received 201 submissions from 8 teams for RRG24 and 211 submissions from 16 teams for “Discharge Me!”.
RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports
Jean-Benoit Delbrouck | Pierre Chambon | Zhihong Chen | Maya Varma | Andrew Johnston | Louis Blankemeier | Dave Van Veen | Tan Bui | Steven Truong | Curtis Langlotz
Findings of the Association for Computational Linguistics: ACL 2024
To enable extraction of structured clinical data from unstructured radiology reports, we introduce RadGraph-XL, a large-scale, expert-annotated dataset for clinical entity and relation extraction. RadGraph-XL consists of 2,300 radiology reports annotated with over 410,000 entities and relations by board-certified radiologists. Whereas previous approaches focus solely on chest X-rays, RadGraph-XL includes data from four anatomy-modality pairs: chest CT, abdomen/pelvis CT, brain MR, and chest X-rays. To automate structured information extraction, we then use RadGraph-XL to train transformer-based models for clinical entity and relation extraction. Our evaluations include comprehensive ablation studies as well as an expert reader study that evaluates trained models on out-of-domain data. Results demonstrate that our model surpasses the performance of previous methods by up to 52% and notably outperforms GPT-4 in this domain. We release RadGraph-XL as well as our trained model to foster further innovation and research in structured clinical information extraction.
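To make the entity-relation structure concrete, here is a minimal sketch of how one annotated sentence might be represented in code. The label scheme shown (anatomy/observation entities with a `located_at` relation) follows the original RadGraph convention and is an assumption here, not an excerpt from the RadGraph-XL release.

```python
from dataclasses import dataclass, field

# Sketch of a RadGraph-style annotation for one report sentence.
# Entity/relation labels (ANAT-DP, OBS-DP, located_at, ...) follow the
# original RadGraph scheme and are assumed, not taken from RadGraph-XL.

@dataclass
class Entity:
    tokens: str        # surface text of the entity span
    label: str         # e.g. "OBS-DP" = observation, definitely present
    relations: list = field(default_factory=list)  # (relation_type, target) pairs

# "Small left pleural effusion." -> effusion --located_at--> pleural
effusion = Entity("effusion", "OBS-DP")
pleural = Entity("pleural", "ANAT-DP")
effusion.relations.append(("located_at", pleural))

for rel, target in effusion.relations:
    print(f"{effusion.tokens} --{rel}--> {target.tokens}")
```

A graph of such entities and relations is what makes downstream structured queries possible (e.g., retrieving every observation located at a given anatomy), which plain free-text reports cannot support directly.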