Ruiying Chen


2026

Retrieval-Augmented Generation (RAG) is widely used to mitigate hallucination and knowledge obsolescence in medical question answering, yet its predominantly single-round, static retrieval paradigm is misaligned with the multi-stage process of clinical reasoning. This compressed workflow induces two structural deficiencies: question-to-query translation often lacks clinically grounded semantic interpretation, and retrieval receives no iterative feedback on evidence sufficiency, making it difficult to form reliable evidence chains. We argue that both issues stem from a deeper cause, namely overloading a single reasoning chain with the heterogeneous tasks of interpretation, exploration, and adjudication, and that the remedy is to reconstruct the workflow through task decoupling and dynamic multi-round exploration. To this end, we propose **SEMA-RAG**, a Self-Evolving Multi-Agent framework that assigns these roles to three specialist agents: an **Interpreter Agent** for clinical schema interpretation, an **Explorer Agent** for sufficiency-driven self-evolving retrieval, and an **Arbiter Agent** for evidence adjudication and answer selection. Across five benchmarks and five LLM backbones, SEMA-RAG outperforms the strongest baseline by an average of **+6.46** accuracy points, measured per backbone.
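To make the decoupled workflow concrete, the following is a minimal sketch of the three-role control flow, not the SEMA-RAG implementation. Everything in it is an assumption made for illustration: the function names (`interpret`, `explore`, `arbitrate`), the toy `CORPUS`, the keyword-overlap retriever, and the rule-based sufficiency check all stand in for components that in practice would presumably be LLM-driven agents over a real medical corpus.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy corpus; illustrative text only, not clinical guidance.
CORPUS: Dict[str, str] = {
    "d1": "chest pain with st elevation suggests myocardial infarction",
    "d2": "myocardial infarction is treated with aspirin and reperfusion",
    "d3": "migraine presents with unilateral headache and photophobia",
}
STOP = {"patient", "with", "and", "needs", "which", "the", "a"}

@dataclass
class Schema:
    concepts: List[str]        # terms extracted from the question
    options: Dict[str, str]    # answer label -> answer text

def interpret(question: str, options: Dict[str, str]) -> Schema:
    """Interpreter role (stand-in): keep content words as 'clinical concepts'."""
    concepts = [w for w in question.lower().split() if w not in STOP]
    return Schema(concepts, options)

def retrieve(terms: List[str], seen: Set[str]) -> Optional[str]:
    """Toy retriever: return the unseen document with the largest word overlap."""
    best, best_score = None, 0
    for doc_id, text in CORPUS.items():
        if doc_id in seen:
            continue
        score = len(set(terms) & set(text.split()))
        if score > best_score:
            best, best_score = doc_id, score
    return best

def sufficient(evidence: List[str], options: Dict[str, str]) -> bool:
    """Assumed sufficiency rule: the evidence mentions at least one option."""
    words = {w for e in evidence for w in CORPUS[e].split()}
    return any(opt.lower() in words for opt in options.values())

def explore(schema: Schema, max_rounds: int = 3) -> List[str]:
    """Explorer role (stand-in): retrieve, check sufficiency, expand the query."""
    evidence: List[str] = []
    terms = list(schema.concepts)
    for _ in range(max_rounds):
        doc = retrieve(terms, set(evidence))
        if doc is None:
            break
        evidence.append(doc)
        if sufficient(evidence, schema.options):
            break
        # Evolve the query with vocabulary from the newly found evidence.
        terms.extend(CORPUS[doc].split())
    return evidence

def arbitrate(schema: Schema, evidence: List[str]) -> str:
    """Arbiter role (stand-in): pick the option best supported by the evidence."""
    words = [w for e in evidence for w in CORPUS[e].split()]
    scores = {k: words.count(v.lower()) for k, v in schema.options.items()}
    return max(scores, key=scores.get)

def answer(question: str, options: Dict[str, str]) -> Tuple[str, List[str]]:
    schema = interpret(question, options)
    evidence = explore(schema)
    return arbitrate(schema, evidence), evidence
```

With this toy corpus, calling `answer("patient with chest pain and st elevation needs which drug", {"A": "ibuprofen", "B": "aspirin"})` first retrieves `d1`, judges it insufficient because it mentions neither option, expands the query with the terms of `d1`, retrieves `d2`, and selects `B`. The point of the sketch is only the separation of concerns: a single-round pipeline would stop after `d1` and leave the arbiter without evidence that distinguishes the options.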