2025
Don’t Forget the Base Retriever! A Low-Resource Graph-based Retriever for Multi-hop Question Answering
Andre Melo | Enting Chen | Pavlos Vougiouklis | Chenxin Diao | Shriram Piramanayagam | Ruofei Lai | Jeff Z. Pan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Traditional Retrieval-augmented Generation systems struggle with complex multi-hop questions, which often require reasoning over multiple passages. While GraphRAG approaches address these challenges, most of them rely on expensive LLM calls. In this paper, we propose GRIEVER, a lightweight, low-resource, multi-step graph-based retriever for multi-hop QA. Unlike prior work, GRIEVER does not rely on LLMs and can perform multi-step retrieval in a few hundred milliseconds. It efficiently indexes passages alongside an associated knowledge graph and employs a hybrid retriever combined with aggressive filtering to reduce retrieval latency. Experiments on multi-hop QA datasets demonstrate that GRIEVER outperforms conventional retrievers and shows strong potential as a base retriever within multi-step agentic frameworks.
GeAR: Graph-enhanced Agent for Retrieval-augmented Generation
Zhili Shen | Chenxin Diao | Pavlos Vougiouklis | Pascual Merita | Shriram Piramanayagam | Enting Chen | Damien Graux | Andre Melo | Ruofei Lai | Zeren Jiang | Zhongyang Li | Ye Qi | Yang Ren | Dandan Tu | Jeff Z. Pan
Findings of the Association for Computational Linguistics: ACL 2025
Retrieval-augmented Generation (RAG) relies on effective retrieval capabilities, yet traditional sparse and dense retrievers inherently struggle with multi-hop retrieval scenarios. In this paper, we introduce GeAR, a system that advances RAG performance through two key innovations: (i) an efficient graph expansion mechanism that augments any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates the resulting graph-based retrieval into a multi-step retrieval framework. Our evaluation demonstrates GeAR’s superior retrieval capabilities across three multi-hop question answering datasets. Notably, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while consuming fewer tokens and requiring fewer iterations than existing multi-step retrieval systems. The project page is available at https://gear-rag.github.io.
2024
Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning
Zhili Shen | Pavlos Vougiouklis | Chenxin Diao | Kaustubh Vyas | Yuanyi Ji | Jeff Z. Pan
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
2023
FastRAT: Fast and Efficient Cross-lingual Text-to-SQL Semantic Parsing
Pavlos Vougiouklis | Nikos Papasarantopoulos | Danna Zheng | David Tuckey | Chenxin Diao | Zhili Shen | Jeff Pan
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)