Shiwei Lyu
2026
ClinAlign: Scaling Healthcare Alignment from Clinician Preference
Shiwei Lyu | Xidong Wang | Hao Zhu | Lei Liu | Chaohe Zhang | Jian Wang | Jinjie Gu | Benyou Wang | Yue Shen
Findings of the Association for Computational Linguistics: ACL 2026
Although large language models (LLMs) demonstrate expert-level medical knowledge, aligning their open-ended outputs with fine-grained clinician preferences remains challenging. Existing methods often rely on coarse objectives or unreliable automated judges that are weakly grounded in professional guidelines. We propose a two-stage framework to address this gap. First, we introduce HealthRubrics, a dataset of 7,034 physician-verified preference examples in which clinicians refine LLM-drafted rubrics to meet rigorous medical standards. Second, we distill these rubrics into HealthPrinciples: 119 broadly reusable, clinically grounded principles organized by clinical dimensions, enabling scalable supervision beyond manual annotation. We use HealthPrinciples (1) for offline alignment, by synthesizing rubrics for unlabeled queries, and (2) as an inference-time tool for guided self-revision. A 30A3B model with our framework achieves 33.4% on HealthBench-Hard, outperforming much larger models including DeepSeek-R1 and o3, establishing a resource-efficient baseline for clinical alignment.
2025
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents
Yuqi Zhu | Shuofei Qiao | Yixin Ou | Shumin Deng | Shiwei Lyu | Yue Shen | Lei Liang | Jinjie Gu | Huajun Chen | Ningyu Zhang
Findings of the Association for Computational Linguistics: NAACL 2025
Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments by generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents: without it, planning trajectories are not effectively guided during task solving, which results in planning hallucination. To address this issue, we introduce KnowAgent, a novel approach designed to enhance the planning capabilities of LLMs by incorporating explicit action knowledge. Specifically, KnowAgent employs an action knowledge base and a knowledgeable self-learning strategy to constrain the action path during planning, enabling more reasonable trajectory synthesis and thereby enhancing the planning performance of language agents. Experimental results on HotpotQA and ALFWorld with various backbone models demonstrate that KnowAgent can achieve comparable or superior performance to existing baselines. Further analysis indicates that KnowAgent is effective at mitigating planning hallucinations.