Chenxin Diao



2025

Don’t Forget the Base Retriever! A Low-Resource Graph-based Retriever for Multi-hop Question Answering
Andre Melo | Enting Chen | Pavlos Vougiouklis | Chenxin Diao | Shriram Piramanayagam | Ruofei Lai | Jeff Z. Pan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

Traditional Retrieval-augmented Generation systems struggle with complex multi-hop questions, which often require reasoning over multiple passages. While GraphRAG approaches address these challenges, most of them rely on expensive LLM calls. In this paper, we propose GRIEVER, a lightweight, low-resource, multi-step graph-based retriever for multi-hop QA. Unlike prior work, GRIEVER does not rely on LLMs and can perform multi-step retrieval in a few hundred milliseconds. It efficiently indexes passages alongside an associated knowledge graph and employs a hybrid retriever combined with aggressive filtering to reduce retrieval latency. Experiments on multi-hop QA datasets demonstrate that GRIEVER outperforms conventional retrievers and shows strong potential as a base retriever within multi-step agentic frameworks.

GeAR: Graph-enhanced Agent for Retrieval-augmented Generation
Zhili Shen | Chenxin Diao | Pavlos Vougiouklis | Pascual Merita | Shriram Piramanayagam | Enting Chen | Damien Graux | Andre Melo | Ruofei Lai | Zeren Jiang | Zhongyang Li | Ye Qi | Yang Ren | Dandan Tu | Jeff Z. Pan
Findings of the Association for Computational Linguistics: ACL 2025

Retrieval-augmented Generation (RAG) relies on effective retrieval capabilities, yet traditional sparse and dense retrievers inherently struggle with multi-hop retrieval scenarios. In this paper, we introduce GeAR, a system that advances RAG performance through two key innovations: (i) an efficient graph expansion mechanism that augments any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates the resulting graph-based retrieval into a multi-step retrieval framework. Our evaluation demonstrates GeAR’s superior retrieval capabilities across three multi-hop question answering datasets. Notably, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while consuming fewer tokens and requiring fewer iterations than existing multi-step retrieval systems. The project page is available at https://gear-rag.github.io.

2024

Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning
Zhili Shen | Pavlos Vougiouklis | Chenxin Diao | Kaustubh Vyas | Yuanyi Ji | Jeff Z. Pan
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

2023

FastRAT: Fast and Efficient Cross-lingual Text-to-SQL Semantic Parsing
Pavlos Vougiouklis | Nikos Papasarantopoulos | Danna Zheng | David Tuckey | Chenxin Diao | Zhili Shen | Jeff Pan
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)