Towards Understanding LLM-Generated Biomedical Lay Summaries

Rohan Charudatt Salvi, Swapnil Panigrahi, Dhruv Jain, Shweta Yadav, Md. Shad Akhtar


Abstract
In this paper, we investigate using large language models to generate accessible lay summaries of medical abstracts, targeting non-expert audiences. We assess the ability of models like GPT-4 and LLaMA 3-8B-Instruct to simplify complex medical information, focusing on layness, comprehensiveness, and factual accuracy. Using both automated and human evaluations, we find that automatic metrics do not always align with human judgments. Our analysis highlights the benefits of developing clear guidelines for consistent evaluation by non-expert reviewers, and it identifies areas for future research to improve both the evaluation process and the creation of lay summaries.
Anthology ID: 2025.cl4health-1.22
Volume: Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
Month: May
Year: 2025
Address: Albuquerque, New Mexico
Editors: Sophia Ananiadou, Dina Demner-Fushman, Deepak Gupta, Paul Thompson
Venues: CL4Health | WS
Publisher: Association for Computational Linguistics
Pages: 260–268
URL: https://preview.aclanthology.org/landing_page/2025.cl4health-1.22/
Cite (ACL): Rohan Charudatt Salvi, Swapnil Panigrahi, Dhruv Jain, Shweta Yadav, and Md. Shad Akhtar. 2025. Towards Understanding LLM-Generated Biomedical Lay Summaries. In Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health), pages 260–268, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Towards Understanding LLM-Generated Biomedical Lay Summaries (Salvi et al., CL4Health 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.cl4health-1.22.pdf