Bridging Health Literacy Gaps in Indian Languages: Multilingual LLMs for Clinical Text Simplification

R S Pavithra


Abstract
We demonstrate how open multilingual LLMs (mT5, IndicTrans2) can simplify complex medical documents into culturally sensitive, patient-friendly text in Indian languages, advancing equitable healthcare communication and multilingual scientific accessibility.
Clinical documents such as discharge summaries, consent forms, and medication instructions are essential for patient care but are often written in complex, jargon-heavy language. This barrier is intensified in multilingual, low-literacy contexts like India, where linguistic diversity meets limited health literacy. We present a multilingual clinical text simplification pipeline that uses open large language models (mT5 and IndicTrans2) to automatically rewrite complex medical text into accessible, culturally appropriate, and patient-friendly versions in English, Hindi, Tamil, and Telugu. On a synthetic dataset of 2,000 discharge summaries, our models achieve up to 42% readability improvement while maintaining factual accuracy. The framework demonstrates how open, reproducible LLMs can bridge linguistic inequities in healthcare communication and support inclusive, patient-centric digital health access in India.
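The abstract reports readability improvement as a percentage but does not name the metric behind it. As a minimal sketch only, assuming a lower-is-better score such as a grade-level index, a relative improvement figure could be computed as below. The function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_improvement(before: float, after: float) -> float:
    """Percent reduction of a lower-is-better readability score.

    Assumption: the score behaves like a grade-level index, where a
    smaller value means easier text. The paper's actual metric is not
    stated in the abstract.
    """
    if before <= 0:
        raise ValueError("baseline score must be positive")
    return (before - after) / before * 100.0


# Hypothetical scores for one document, for illustration only.
original_grade = 12.0
simplified_grade = 9.0
print(f"{relative_improvement(original_grade, simplified_grade):.1f}% improvement")
# prints "25.0% improvement"
```

If the underlying metric were higher-is-better (for example a reading-ease score), the sign of the numerator would need to be flipped.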
Anthology ID:
2025.sciprodllm-1.1
Volume:
Proceedings of The First Workshop on Human–LLM Collaboration for Ethical and Responsible Science Production (SciProdLLM)
Month:
December
Year:
2025
Address:
Mumbai, India (Hybrid)
Editors:
Wei Zhao, Jennifer D’Souza, Steffen Eger, Anne Lauscher, Yufang Hou, Nafise Sadat Moosavi, Tristan Miller, Chenghua Lin
Venues:
SciProdLLM | WS
Publisher:
Association for Computational Linguistics
Pages:
1–5
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.sciprodllm-1.1/
Cite (ACL):
R S Pavithra. 2025. Bridging Health Literacy Gaps in Indian Languages: Multilingual LLMs for Clinical Text Simplification. In Proceedings of The First Workshop on Human–LLM Collaboration for Ethical and Responsible Science Production (SciProdLLM), pages 1–5, Mumbai, India (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Bridging Health Literacy Gaps in Indian Languages: Multilingual LLMs for Clinical Text Simplification (Pavithra, SciProdLLM 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.sciprodllm-1.1.pdf