Zihan Wang
Other people with similar names: Zihan Wang, Zihan Wang, Zihan Wang, Zihan Wang, Zihan Wang
Unverified author pages with similar names: Zihan Wang
2026
MTRouter: Cost-Aware Multi-Turn LLM Routing with History–Model Joint Embeddings
Yiqun Zhang | Hao Li | Zihan Wang | Shi Feng | Xiaocui Yang | Daling Wang | Bo Zhang | Lei Bai | Shuyue Hu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Yiqun Zhang | Hao Li | Zihan Wang | Shi Feng | Xiaocui Yang | Daling Wang | Bo Zhang | Lei Bai | Shuyue Hu
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Multi-turn, long-horizon tasks are increasingly common for large language models (LLMs), but solving them typically requires many sequential model invocations, accumulating substantial inference costs. Here, we study cost-aware multi-turn LLM routing: selecting which model to invoke at each turn from a model pool, given a fixed cost budget. We propose MTRouter, which encodes the interaction history and candidate models into joint history–model embeddings, and learns an outcome estimator from logged trajectories to predict turn-level model utility. Experiments show that MTRouter improves the performance–cost trade-off: on ScienceWorld, it surpasses GPT-5 while reducing total cost by 58.7%; on Humanity’s Last Exam (HLE), it achieves competitive accuracy while reducing total cost by 43.4% relative to GPT-5, and these gains even carry over to held-out tasks. Further analyses reveal several mechanisms underlying its effectiveness: relative to prior multi-turn routers, MTRouter makes fewer model switches, is more tolerant to transient errors, and exhibits emergent specialization across models.Code: https://github.com/ZhangYiqun018/MTRouter
DeepMed: Building a Medical DeepResearch Agent via Multi-hop Med-Search Data and Turn-Controlled Agentic Training & Inference
Zihan Wang | Hao Wang | Shi Feng | Xiaocui Yang | Daling Wang | Yiqun Zhang | Jinghao Lin | Xiaozhong Ji | Haihua Yang
Findings of the Association for Computational Linguistics: ACL 2026
Zihan Wang | Hao Wang | Shi Feng | Xiaocui Yang | Daling Wang | Yiqun Zhang | Jinghao Lin | Xiaozhong Ji | Haihua Yang
Findings of the Association for Computational Linguistics: ACL 2026
Medical reasoning models remain constrained by parametric knowledge and are thus susceptible to forgetting and hallucinations. DeepResearch (DR) models ground outputs in verifiable evidence from tools and perform strongly in general domains, but their direct transfer to medical field yields relatively limited gains. We attribute this to two gaps: task characteristic and tool-use scaling. Medical questions require evidence interpretation in a knowledge-intensive clinical context; while general DR models can retrieve information, they often lack clinical-context reasoning and thus “find it but fail to use it,” leaving performance limited by medical abilities. Moreover, in medical scenarios, blindly scaling tool-call can inject noisy context, derailing sensitive medical reasoning and prompting repetitive evidence-seeking along incorrect paths. Therefore, we propose DeepMed. For data, we deploy a multi-hop med-search QA synthesis method supporting the model to apply the DR paradigm in medical contexts. For training, we introduce a difficulty-aware turn-penalty to suppress excessive tool-call growth. For inference, we bring a monitor to help validate hypotheses within a controlled number of steps and avoid context rot. Overall, on seven medical benchmarks, DeepMed improves its base model by 9.79% on average and outperforms larger medical reasoning and DR models.
CIRAG: Construction–Integration Retrieval and Adaptive Generation for Multi-hop Question Answering
Zili Wei | Yilin Wang | Xiaocui Yang | Shi Feng | Weidong Bao | Daling Wang | Zihan Wang | Yifei Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Zili Wei | Yilin Wang | Xiaocui Yang | Shi Feng | Weidong Bao | Daling Wang | Zihan Wang | Yifei Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Triple-based Iterative Retrieval-Augmented Generation (iRAG) mitigates document-level noise for multi-hop question answering. However, existing methods still face limitations: (i) greedy single-path expansion, which propagates early errors and fails to capture parallel evidence from different reasoning branches, and (ii) granularity–demand mismatch, where a single evidence representation struggles to balance noise control with contextual sufficiency. In this paper, we propose the Construction–Integration Retrieval and Adaptive Generation model, CIRAG. It introduces an Iterative Construction–Integration module that constructs candidate triples and history-conditionally integrates them to distill core triples and generate the next-hop query. This module mitigates the greedy trap by preserving multiple plausible evidence chains. Besides, to address the granularity–demand mismatch, we propose an Adaptive Cascaded Multi-Granularity Generation module that progressively expands contextual evidence based on the problem requirements, from triples to supporting sentences and full passages. Moreover, we introduce Trajectory Distillation, which distills the teacher model’s integration policy into a lightweight student, enabling efficient and reliable long-horizon reasoning. Extensive experiments demonstrate that CIRAG achieves superior performance compared to existing iRAG methods.