ProMed: Shapley Information Gain Guided Reinforcement Learning for Proactive Medical LLMs

Hongxin Ding; Baixiang Huang; Yue Fang; Weibin Liao; Xinke Jiang; Jinyang Zhang; Yinghao Zhu; Zheng Li; Liantao Ma; Junfeng Zhao; Yasha Wang

Abstract

Interactive medical questioning is essential in clinical consultations, where physicians must actively gather the patient information they need. Yet existing medical Large Language Models (LLMs) predominantly follow a reactive paradigm: they answer before seeking sufficient details, which risks diagnostic errors. To bridge this gap, we propose ProMed, a reinforcement learning framework that moves LLMs toward a proactive paradigm, enabling them to ask clinically valuable questions before making a decision. Central to ProMed is the Shapley Information Gain (SIG) reward, which quantifies a question’s clinical utility as the amount of newly acquired information, weighted by its contextual importance via Shapley values. We integrate SIG into a two-stage training pipeline: (1) SIG-Guided Model Initialization, which uses Monte Carlo Tree Search to construct high-reward interaction trajectories for supervision, and (2) SIG-Augmented Policy Optimization, which introduces a novel SIG-guided Reward Distribution Mechanism that prioritizes informative questions for fine-grained optimization. Experiments on partial-information medical benchmarks show that ProMed significantly outperforms state-of-the-art methods by 6.29% on average, delivers a 54.45% gain over the reactive paradigm, and generalizes robustly to out-of-domain cases. Our code is available at https://github.com/hxxding/ProMed.
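
The abstract describes SIG only at a high level, so the following is a minimal sketch of one possible reading, not the authors' implementation. It computes exact Shapley values for a handful of patient facts under a made-up value function, then scores a question by summing the Shapley values of the facts it newly reveals. Every name here (`shapley_values`, `toy_value`, `sig_reward`, the fact labels, and the numeric scores) is a hypothetical chosen for illustration; the paper's actual reward definition may differ.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small player set and a set-valued function."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k that exclude p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical patient facts (assumption, not from the paper).
FACTS = ["fever", "cough", "travel_history"]


def toy_value(known):
    """Made-up diagnostic usefulness of a set of known facts (assumption)."""
    score = 0.0
    if "fever" in known:
        score += 0.2
    if "cough" in known:
        score += 0.1
    # Travel history only matters here in combination with fever.
    if "fever" in known and "travel_history" in known:
        score += 0.4
    return score


def sig_reward(revealed, already_known, phi):
    """Illustrative reward: Shapley mass of the facts a question newly reveals."""
    new_facts = set(revealed) - set(already_known)
    return sum(phi[f] for f in new_facts)


phi = shapley_values(FACTS, toy_value)
# A question that re-confirms fever and newly elicits travel history.
reward = sig_reward({"fever", "travel_history"}, {"fever"}, phi)
```

In this toy setup a question is rewarded only for facts not already known, and a fact that matters mainly in combination with others (here, travel history alongside fever) still receives a share of the joint value. Exact enumeration is exponential in the number of facts, so a practical system would presumably rely on sampling or another approximation.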

Anthology ID: 2026.acl-long.1500
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 32481–32515
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1500/
Cite (ACL): Hongxin Ding, Baixiang Huang, Yue Fang, Weibin Liao, Xinke Jiang, Jinyang Zhang, Yinghao Zhu, Zheng Li, Liantao Ma, Junfeng Zhao, and Yasha Wang. 2026. ProMed: Shapley Information Gain Guided Reinforcement Learning for Proactive Medical LLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 32481–32515, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): ProMed: Shapley Information Gain Guided Reinforcement Learning for Proactive Medical LLMs (Ding et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1500.pdf
Checklist: 2026.acl-long.1500.checklist.pdf
