Yuxuan Zhou
Papers on this page may belong to the following people: Yuxuan Zhou
2026
Is GraphRAG Needed? From Basic RAG to Graph-/Agentic Solutions with Context Optimization
Long Chen | Ryan Razkenari | Yuxuan Zhou | Yuan Tian | Rahul Ghosh | Venkatesh Pappakrishnan | Disha Ahuja | Vidya Sagar Ravipati
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
Long Chen | Ryan Razkenari | Yuxuan Zhou | Yuan Tian | Rahul Ghosh | Venkatesh Pappakrishnan | Disha Ahuja | Vidya Sagar Ravipati
Proceedings of the Fifth Workshop on Generation, Evaluation and Metrics (GEM)
As advanced RAG variants like GraphRAG and Agentic RAG emerge, one leading question is when and how to use them. Here, we introduce a framework for different RAG scenarios evaluation and comparison on semi-structured knowledge bases, including regular RAG, GraphRAG, Modular RAG and Agentic RAG. We provide implementation for 9 standardized RAG scenarios, and conduct experiments for a comprehensive comparison. These scenarios are designed for real use cases regarding data and domain restrictions, spanning from simple document-based retrieval to advanced features such as hybrid text-graph retrieval, integration with computed or pre-defined domain knowledge graphs, agentic multi-step planning, and agent-graph integration. Besides, we present a novel context engineering method for GraphRAG and Agentic RAG, addressing the context/memory overflow issues, efficiently managing text and graph retrievals with new representations and agentic loop design, leading to 19%-53% reduction on token usage. Moreover, further analysis identifies a retrieval-generation gap where expanded retrieval does not proportionally improve generation quality, suggesting retrieval-oriented metrics overstate advanced retrieval benefits. This work provides data-driven insights on when and how to use them for building production-ready intelligent RAG systems.
2023
THiFLY Research at SemEval-2023 Task 7: A Multi-granularity System for CTR-based Textual Entailment and Evidence Retrieval
Yuxuan Zhou | Ziyu Jin | Meiwei Li | Miao Li | Xien Liu | Xinxin You | Ji Wu
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Yuxuan Zhou | Ziyu Jin | Meiwei Li | Miao Li | Xien Liu | Xinxin You | Ji Wu
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
The NLI4CT task aims to entail hypotheses based on Clinical Trial Reports (CTRs) and retrieve the corresponding evidence supporting the justification. This task poses a significant challenge, as verifying hypotheses in the NLI4CT task requires the integration of multiple pieces of evidence from one or two CTR(s) and the application of diverse levels of reasoning, including textual and numerical. To address these problems, we present a multi-granularity system for CTR-based textual entailment and evidence retrieval in this paper. Specifically, we construct a Multi-granularity Inference Network (MGNet) that exploits sentence-level and token-level encoding to handle both textual entailment and evidence retrieval tasks. Moreover, we enhance the numerical inference capability of the system by leveraging a T5-based model, SciFive, which is pre-trained on the medical corpus. Model ensembling and a joint inference method are further utilized in the system to increase the stability and consistency of inference. The system achieves f1-scores of 0.856 and 0.853 on textual entailment and evidence retrieval tasks, resulting in the best performance on both subtasks. The experimental results corroborate the effectiveness of our proposed method.
2022
Table-based Fact Verification with Self-adaptive Mixture of Experts
Yuxuan Zhou | Xien Liu | Kaiyin Zhou | Ji Wu
Findings of the Association for Computational Linguistics: ACL 2022
Yuxuan Zhou | Xien Liu | Kaiyin Zhou | Ji Wu
Findings of the Association for Computational Linguistics: ACL 2022
The table-based fact verification task has recently gained widespread attention and yet remains to be a very challenging problem. It inherently requires informative reasoning over natural language together with different numerical and logical reasoning on tables (e.g., count, superlative, comparative). Considering that, we exploit mixture-of-experts and present in this paper a new method: Self-adaptive Mixture-of-Experts Network (SaMoE). Specifically, we have developed a mixture-of-experts neural network to recognize and execute different types of reasoning—the network is composed of multiple experts, each handling a specific part of the semantics for reasoning, whereas a management module is applied to decide the contribution of each expert network to the verification result. A self-adaptive method is developed to teach the management module combining results of different experts more efficiently without external knowledge. The experimental results illustrate that our framework achieves 85.1% accuracy on the benchmark dataset TabFact, comparable with the previous state-of-the-art models. We hope our framework can serve as a new baseline for table-based verification. Our code is available at https://github.com/THUMLP/SaMoE.
2021
THiFly_Queens at SemEval-2021 Task 9: Two-stage Statement Verification with Adaptive Ensembling and Slot-based Operation
Yuxuan Zhou | Kaiyin Zhou | Xien Liu | Ji Wu | Xiaodan Zhu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
Yuxuan Zhou | Kaiyin Zhou | Xien Liu | Ji Wu | Xiaodan Zhu
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
This paper describes our system for verifying statements with tables at SemEval-2021 Task 9. We developed a two-stage verifying system based on the latest table-based pre-trained model GraPPa. Multiple networks are devised to verify different types of statements in the competition dataset and an adaptive model ensembling technique is applied to ensemble models in both stages. A statement-slot-based symbolic operation module is also used in our system to further improve the performance and stability of the system. Our model achieves second place in the 3-way classification and fourth place in the 2-way classification evaluation. Several ablation experiments show the effectiveness of different modules proposed in this paper.