Zhuangdi Zhu
2026
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategic Conversations
Zijie Liu | Xinyu Zhao | Jie Peng | Jinhao Duan | Zhuangdi Zhu | Qingyu Chen | Kaidi Xu | Xia Hu | Tianlong Chen
Findings of the Association for Computational Linguistics: EACL 2026
In real clinical practice, clinicians must sift through noisy and often conflicting information, progressively gathering and sequencing evidence before reaching conclusions. However, existing tuning methods for medical AI models are typically monologue-based: models are fine-tuned on static question answering (QA) tasks or medical articles, which fail to reflect the interactive and iterative nature of clinical reasoning. To bridge this gap, we introduce MuddyMaze, a benchmark designed to expose the limitations of current monologue-based tuning, and construct a large dialogue dataset of 22.2k doctor–patient interactions that capture stepwise diagnostic reasoning validated by medical experts. Building on these, we propose dialogue-tuning, a new fine-tuning paradigm that captures the internal reasoning dynamics unfolding across interactions. To assess the effectiveness of our approach, we evaluated dialogue-tuned models on MuddyMaze, where they outperform monologue-tuned baselines (e.g., MedQA) by +16.1% in one-round and +4.1% in multi-round evidence ranking, while maintaining or even improving accuracy on standard medical QA benchmarks (e.g., PubMedQA). These results indicate that dialogue-tuning not only enhances reasoning robustness and evidence integration but also preserves the factual precision of traditional QA performance.
2025
Web Intellectual Property at Risk: Preventing Unauthorized Real-Time Retrieval by Large Language Models
Yisheng Zhong | Yizhu Wen | Junfeng Guo | Mehran Kafai | Heng Huang | Hanqing Guo | Zhuangdi Zhu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The protection of cyber Intellectual Property (IP) such as web content is an increasingly critical concern. The rise of large language models (LLMs) with online retrieval capabilities enables convenient access to information but often undermines the rights of original content creators. As users increasingly rely on LLM-generated responses, they engage less directly with original information sources, which will significantly reduce the incentives for IP creators to contribute and lead to a cyberspace increasingly saturated with AI-generated content. In response, we propose a novel defense framework that empowers web content creators to safeguard their web-based IP from unauthorized LLM real-time extraction and redistribution by leveraging the semantic understanding capability of LLMs themselves. Our method follows principled motivations and effectively addresses an intractable black-box optimization problem. Real-world experiments demonstrate that our method improves defense success rates from 2.5% to 88.6% on different LLMs, outperforming traditional defenses such as configuration-based restrictions.