Zheng Li

2026

Interactive medical questioning is essential in clinical consultations, where physicians must actively gather necessary patient information. Yet existing medical Large Language Models (LLMs) predominantly follow a reactive paradigm, risking diagnostic errors by answering before seeking sufficient details. To bridge this gap, we propose ProMed, a reinforcement learning framework that transitions LLMs toward a proactive paradigm, enabling them to ask clinically valuable questions before decision-making. Central to ProMed is the Shapley Information Gain (SIG) reward, which quantifies a question’s clinical utility as the amount of newly acquired information, weighted by its contextual importance via Shapley values. We integrate SIG into a two-stage training pipeline: (1) SIG-Guided Model Initialization, which uses Monte Carlo Tree Search to construct high-reward interaction trajectories for supervision, and (2) SIG-Augmented Policy Optimization, which applies a novel SIG-guided Reward Distribution Mechanism that prioritizes informative questions for fine-grained optimization. Experiments on partial-information medical benchmarks show that ProMed significantly outperforms state-of-the-art methods by 6.29% on average, delivers a 54.45% gain over the reactive paradigm, and generalizes robustly to out-of-domain cases. Our code is available at https://github.com/hxxding/ProMed.
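The abstract does not give the SIG formula, so the following is only a minimal sketch of the general idea, assuming a toy setting: each patient fact receives an exact Shapley value under a made-up coalition value function, and a question is rewarded by the summed Shapley values of the facts it newly reveals. The fact names, `toy_value`, and `sig_reward` are hypothetical illustrations and are not taken from the ProMed code.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley value of each player under a coalition value function."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result


def toy_value(facts):
    """Hypothetical diagnostic usefulness of a set of known patient facts."""
    score = 0.0
    if "chest_pain" in facts:
        score += 0.2
    if "radiates_to_arm" in facts:
        score += 0.1
    # Assumed interaction: the two symptoms are far more useful together.
    if {"chest_pain", "radiates_to_arm"} <= facts:
        score += 0.4
    if "recent_travel" in facts:
        score += 0.1
    return score


FACTS = ["chest_pain", "radiates_to_arm", "recent_travel"]
PHI = shapley_values(FACTS, toy_value)


def sig_reward(revealed, already_known, phi=PHI):
    """Toy reward: summed Shapley values of facts the question newly reveals."""
    return sum(phi[f] for f in revealed if f not in already_known)


if __name__ == "__main__":
    print(PHI)
    print(sig_reward({"radiates_to_arm"}, already_known={"chest_pain"}))
```

In this toy setup, a question that only re-elicits an already known fact earns zero reward, which mirrors the abstract's emphasis on newly acquired information. Exact enumeration is exponential in the number of facts, so a realistic system would presumably need an approximation.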

Personality detection aims to label an individual’s traits by identifying linguistic cues in their written text. Previous approaches typically map text directly to trait labels or apply static reasoning to this task. In this paper, we argue that dynamic reasoning, underpinned by psychological theory, is essential for personality trait inference. To address this, we propose PsyPath, a novel framework that models personality detection as a process of psychologically guided self-exploration. By enabling large language models (LLMs) to dynamically generate and answer psychologically meaningful questions, our method creates a dynamic reasoning path to explore the underlying dimensions of personality traits. This mechanism not only makes the reasoning process transparent, but also helps the model understand personality nuances in a way that mirrors expert psychological reasoning. For the "guided self-exploration", we propose a novel hybrid scoring mechanism that evaluates the generated nodes in the reasoning paths step by step, balancing psychological coherence (black-box scoring) and model output dynamics (white-box scoring). This reasoning-based formulation inherently reflects how psychologists assess personality, as they rely on iterative, diagnostic reasoning. Experiments on two benchmark datasets demonstrate that PsyPath consistently outperforms strong baselines, yielding improvements in predictive accuracy and model interpretability. Moreover, the generated reasoning paths provide psychologically meaningful training data, significantly improving performance and psychologically grounded interpretability in downstream tasks.

Venues

ACL: 1
Findings: 1