iHealth-Chile-1 at RRG24: In-context Learning and Finetuning of a Large Multimodal Model for Radiology Report Generation
Diego Campanini, Oscar Loch, Pablo Messina, Rafael Elberg, Denis Parra
Abstract
This paper presents the approach of the iHealth-Chile-1 team for the shared task of Large-Scale Radiology Report Generation at the BioNLP workshop, inspired by progress in large multimodal models for processing images and text. In this work, we leverage LLaVA, a vision-language model (VLM) composed of a vision encoder, a vision-language connector (adapter), and a large language model able to process text and visual embeddings. We achieve our best result by enriching the input prompt of LLaVA with the text output of a simpler report generation model. With this enriched-prompt technique, we improve our results in 4 of 5 metrics (BLEU-4, ROUGE-L, BERTScore, and F1-RadGraph) using only in-context learning. Moreover, we provide details about different architecture settings, fine-tuning strategies, and dataset configurations.
- Anthology ID:
- 2024.bionlp-1.52
- Volume:
- Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand
- Editors:
- Dina Demner-Fushman, Sophia Ananiadou, Makoto Miwa, Kirk Roberts, Junichi Tsujii
- Venues:
- BioNLP | WS
- SIG:
- SIGBIOMED
- Publisher:
- Association for Computational Linguistics
- Pages:
- 608–613
- URL:
- https://aclanthology.org/2024.bionlp-1.52
- DOI:
- 10.18653/v1/2024.bionlp-1.52
- Cite (ACL):
- Diego Campanini, Oscar Loch, Pablo Messina, Rafael Elberg, and Denis Parra. 2024. iHealth-Chile-1 at RRG24: In-context Learning and Finetuning of a Large Multimodal Model for Radiology Report Generation. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pages 608–613, Bangkok, Thailand. Association for Computational Linguistics.
- Cite (Informal):
- iHealth-Chile-1 at RRG24: In-context Learning and Finetuning of a Large Multimodal Model for Radiology Report Generation (Campanini et al., BioNLP-WS 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2024.bionlp-1.52.pdf