Chengze Ge
2026
GuideTree: Guideline-Induced Review Trees for Long Medical Records
Chengze Ge | Ruiqing Zhang | Yining Wang | Shengping Liu | Liang Jiaen | Weihuang | Weihua Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Reviewing medical records for clinical and insurance decisions must handle long, heterogeneous documents while producing consistent, traceable, guideline-compliant outcomes under strict latency and cost constraints. We propose GuideTree, which compiles textual guidelines into a fixed review tree of evidence-grounded verification primitives. GuideTree uses short per-document summaries only for routing each check to a minimal set of document types and candidates; final verification always reads full document text and returns structured evidence. The tree is induced offline via a cost-aware split-and-prune search and updated safely through regression-tested, versioned patches. Across 1,000 cases from four industrial review scenarios and four LLM backbones, GuideTree achieves 84.5–92.8 Macro-F1, outperforming the strongest non-expert baselines by 3.3–7.6 points and matching ExpertTree within 0.2–0.6 points (avg. 0.38). On chronic disease with Qwen3-235B-A22B-Instruct, GuideTree reduces average I/O volume to 74K input+output characters (-82% vs. long-context prompting) and average latency to 22s (-83% vs. long-context prompting), while reaching 99% decision consistency over K=5 reruns.
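As a rough illustration of the routing-then-verification split described in this abstract, the sketch below uses short summaries only to pick candidate documents and then checks the full text, returning structured evidence. Every name in it (`Document`, `Check`, `route`, `verify`), the keyword matching, and the sample records are hypothetical stand-ins chosen for illustration. The paper's verification primitives are LLM-based, so this should be read as a toy under those assumptions, not as the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_type: str   # e.g. "lab_report" (hypothetical type label)
    summary: str    # short summary, used for routing only
    full_text: str  # full text, always read at verification time


@dataclass
class Check:
    name: str
    doc_types: set          # document types this check may be routed to
    route_keyword: str      # hypothetical routing cue matched against summaries
    evidence_phrase: str    # hypothetical phrase that must appear in the full text


def route(check, docs):
    """Select candidate documents using only document type and summary."""
    return [
        d for d in docs
        if d.doc_type in check.doc_types
        and check.route_keyword.lower() in d.summary.lower()
    ]


def verify(check, docs):
    """Verify the check against full text and return structured evidence."""
    for doc in route(check, docs):
        idx = doc.full_text.lower().find(check.evidence_phrase.lower())
        if idx >= 0:
            span = doc.full_text[idx:idx + len(check.evidence_phrase)]
            return {"check": check.name, "passed": True,
                    "doc_type": doc.doc_type, "evidence": span}
    return {"check": check.name, "passed": False,
            "doc_type": None, "evidence": None}


# Made-up sample records, for illustration only.
docs = [
    Document("lab_report", "HbA1c and glucose results",
             "HbA1c 8.1% measured on admission."),
    Document("invoice", "Billing items", "Pharmacy charges listed."),
]
check = Check("hba1c_documented", {"lab_report"}, "hba1c", "HbA1c 8.1%")
result = verify(check, docs)
```

Keeping routing and verification as separate functions mirrors the abstract's point that summaries narrow the search while the final decision is always grounded in the full document text.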
2025
High-Quality Medical Dialogue Synthesis for Improving EMR Generation
Chengze Ge | Yu Xu | Qi Shao | Shengping Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
High-quality doctor–patient dialogues, by which we mean realistic and human-like interactions that are intent-consistent, clinically faithful, and free of contradictions, are crucial for accurate Electronic Medical Record (EMR) generation. However, collecting large-scale real dialogues is costly and constrained by privacy regulations, while existing synthetic methods often yield rigid and medically inconsistent dialogues. We propose a scalable framework integrating (1) Intent Graph Planning for diverse clinical flows, (2) Dual-Agent Simulation for realistic doctor–patient interactions, and (3) Rule-Reward Quality Control combining explicit medical rules with a self-supervised reward model. Experiments across multiple clinical domains demonstrate that our synthesized dialogues significantly enhance realism, diversity, and downstream EMR quality, substantially reducing physician editing effort. Our framework provides a practical and privacy-compliant solution for deploying robust clinical NLP systems.