Libra: Leveraging Temporal Images for Biomedical Radiology Analysis

Xi Zhang; Zaiqiao Meng; Jake Lever; Edmond S. L. Ho

Abstract

Radiology report generation (RRG) requires advanced medical image analysis, effective temporal reasoning, and accurate text generation. While multimodal large language models (MLLMs) align with pre-trained vision encoders to enhance visual-language understanding, most existing methods rely on single-image analysis or rule-based heuristics to process multiple images, failing to fully leverage temporal information in multimodal medical datasets. In this paper, we introduce **Libra**, a temporal-aware MLLM tailored for chest X-ray report generation. Libra combines a radiology-specific image encoder with a novel Temporal Alignment Connector (**TAC**), designed to accurately capture and integrate temporal differences between paired current and prior images. Extensive experiments on the MIMIC-CXR dataset demonstrate that Libra establishes a new state-of-the-art benchmark among similarly scaled MLLMs, setting new standards in both clinical relevance and lexical accuracy. All source code and data are publicly available at: https://github.com/X-iZhang/Libra.

Anthology ID: 2025.findings-acl.888
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues: Findings | WS
Publisher: Association for Computational Linguistics
Pages: 17275–17303
URL: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.888/
Cite (ACL): Xi Zhang, Zaiqiao Meng, Jake Lever, and Edmond S. L. Ho. 2025. Libra: Leveraging Temporal Images for Biomedical Radiology Analysis. In Findings of the Association for Computational Linguistics: ACL 2025, pages 17275–17303, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Libra: Leveraging Temporal Images for Biomedical Radiology Analysis (Zhang et al., Findings 2025)
PDF: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.888.pdf
