Weihua Chen
2026
GuideTree: Guideline-Induced Review Trees for Long Medical Records
Chengze Ge | Ruiqing Zhang | Yining Wang | Shengping Liu | Liang Jiaen | Weihuang | Weihua Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Reviewing medical records for clinical and insurance decisions must handle long, heterogeneous documents while producing consistent, traceable, guideline-compliant outcomes under strict latency and cost constraints. We propose GuideTree, which compiles textual guidelines into a fixed review tree of evidence-grounded verification primitives. GuideTree uses short per-document summaries only for routing each check to a minimal set of document types and candidates; final verification always reads full document text and returns structured evidence. The tree is induced offline via a cost-aware split-and-prune search and updated safely through regression-tested, versioned patches. Across 1,000 cases from four industrial review scenarios and four LLM backbones, GuideTree achieves 84.5–92.8 Macro-F1, outperforming the strongest non-expert baselines by 3.3–7.6 points and matching ExpertTree within 0.2–0.6 points (avg. 0.38). On chronic disease with Qwen3-235B-A22B-Instruct, GuideTree reduces average I/O volume to 74K input+output characters (-82% vs. long-context prompting) and average latency to 22s (-83% vs. long-context prompting), while reaching 99% decision consistency over K=5 reruns.
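The routing-then-verification flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: every name here (`Document`, `CheckNode`, `route`, `run_check`, `review`, `find_sentence`) is hypothetical, the tree is written by hand rather than induced by a cost-aware search, and simple keyword matching stands in for the LLM calls. The one property it tries to mirror is that summaries are consulted only to select candidate documents, while each check reads the full text and returns a quoted piece of evidence.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class Document:
    doc_type: str
    summary: str  # short summary, used only for routing
    text: str     # full text, read during verification


@dataclass
class Evidence:
    passed: bool
    doc_type: Optional[str] = None
    quote: Optional[str] = None


@dataclass
class CheckNode:
    name: str
    doc_types: set               # document types this check may be routed to
    route_keyword: str           # matched against summaries only (assumption)
    verify: Callable[[str], Optional[str]]  # full text -> evidence quote or None
    on_pass: Union["CheckNode", str]        # next check, or a final decision label
    on_fail: Union["CheckNode", str]


def find_sentence(keyword: str) -> Callable[[str], Optional[str]]:
    """Keyword stand-in for an LLM verifier: return the first matching sentence."""
    def verify(text: str) -> Optional[str]:
        for sentence in text.split("."):
            if keyword in sentence.lower():
                return sentence.strip()
        return None
    return verify


def route(node: CheckNode, docs: list) -> list:
    """Select candidates by document type and summary; full text is not read here."""
    return [d for d in docs
            if d.doc_type in node.doc_types and node.route_keyword in d.summary.lower()]


def run_check(node: CheckNode, docs: list) -> Evidence:
    """Verify against the full text of each routed candidate."""
    for d in route(node, docs):
        quote = node.verify(d.text)
        if quote is not None:
            return Evidence(True, d.doc_type, quote)
    return Evidence(False)


def review(root: CheckNode, docs: list):
    """Walk the fixed tree from the root; return the decision and an evidence trace."""
    trace, node = [], root
    while not isinstance(node, str):
        ev = run_check(node, docs)
        trace.append((node.name, ev))
        node = node.on_pass if ev.passed else node.on_fail
    return node, trace


# Toy two-level tree and case (all values invented for illustration).
has_treatment = CheckNode("ongoing_treatment", {"prescription"}, "insulin",
                          find_sentence("insulin"), "approve", "reject")
has_diagnosis = CheckNode("diagnosis_present", {"discharge_summary"}, "diabetes",
                          find_sentence("type 2 diabetes"), has_treatment, "reject")

docs = [
    Document("discharge_summary", "Discharge note mentioning diabetes.",
             "Patient admitted on 3 May. Diagnosed with type 2 diabetes. Discharged stable."),
    Document("prescription", "Insulin prescription.",
             "Insulin glargine 10 units nightly."),
    Document("invoice", "Billing for diabetes care.", "Total due: 120."),
]

decision, trace = review(has_diagnosis, docs)
for name, ev in trace:
    print(name, ev.passed, ev.doc_type, ev.quote)
print("decision:", decision)
```

Because the tree is fixed and each node returns the supporting quote together with its source document type, the same inputs follow the same path and every decision can be traced back to specific text, which is the kind of consistency and traceability the abstract emphasizes. In a real system the `verify` callable would wrap a model call rather than a keyword match.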