Yuhang Tian

2025

Knowledge Base Question Answering (KBQA) aims to extract accurate answers from a Knowledge Base (KB). Traditional Semantic Parsing (SP)-based methods are widely used but struggle with complex queries. Recently, large language models (LLMs) have shown promise in improving KBQA performance. However, generating error-free logical forms remains a challenge, as skeleton, topic entity, and relation errors still occur frequently. To address these challenges, we propose CompKBQA (Component-wise Task Decomposition for Knowledge Base Question Answering), a novel framework that optimizes the fine-tuning of an LLM for logical form generation by enabling the LLM to progressively learn relevant sub-tasks: skeleton generation, topic entity generation, and relevant relation generation. Additionally, we propose R³, which retrieves KB information and incorporates it into the process of logical form generation. Experimental evaluations on two benchmark KBQA datasets, WebQSP and CWQ, demonstrate that CompKBQA achieves state-of-the-art performance, highlighting the importance of task decomposition and KB-aware learning.

GRV-KBQA: A Three-Stage Framework for Knowledge Base Question Answering with Decoupled Logical Structure, Semantic Grounding and Structure-Aware Validation
Yuhang Tian | Pan Yang | Dandan Song | Zhijing Wu | Hao Wang
Findings of the Association for Computational Linguistics: EMNLP 2025

Knowledge Base Question Answering (KBQA) is a fundamental task that enables natural language interaction with structured knowledge bases (KBs). Given a natural language question, KBQA aims to retrieve the answers from the KB. However, existing approaches, including retrieval-based, semantic parsing-based, and large language model-based methods, often generate non-executable queries and execute queries inefficiently. To address these challenges, we propose GRV-KBQA, a three-stage framework that decouples logical structure generation from semantic grounding and incorporates structure-aware validation to enhance accuracy. Unlike previous methods, GRV-KBQA explicitly enforces KB constraints to improve alignment between generated logical forms and KB structures. Experimental results on WebQSP and CWQ show that GRV-KBQA significantly improves performance over existing approaches. An ablation study confirms the effectiveness of the framework's decoupled logical form generation and validation mechanism.

Path-enhanced Pre-trained Language Model for Knowledge Graph Completion
Hao Wang | Dandan Song | Zhijing Wu | Yuhang Tian | Pan Yang
Findings of the Association for Computational Linguistics: EMNLP 2025

Pre-trained language models (PLMs) have achieved remarkable success in knowledge graph completion (KGC). However, most methods derive KGC results mainly from triple-level and text-described learning, which lacks the capability to capture long-term relational and structural information. Moreover, the absence of a visible reasoning process leads to poor interpretability and credibility of the completions. In this paper, we propose a path-enhanced pre-trained language model-based knowledge graph completion method (PEKGC), which employs multi-view generation to infer missing facts at the triple level and the path level simultaneously, addressing both the lack of long-term relational information and the interpretability issue. Furthermore, a neighbor selector module is proposed to filter neighbor triples and provide adjacent structural information. We also propose a fact-level re-evaluation and a heuristic fusion ranking strategy for candidate answers to fuse the multi-view predictions. Extensive experiments on the benchmark datasets demonstrate that our model significantly improves performance on the KGC task.

2024

Knowledge graphs (KGs) can provide explainable reasoning for large language models (LLMs), alleviating their hallucination problem. Knowledge graph question answering (KGQA) is a typical benchmark for evaluating methods that enhance LLMs with KGs. Previous KG-enhanced LLM methods for KGQA either enhance LLMs with KG retrieval in a single round or perform multi-hop KG reasoning with LLMs over multiple rounds. Both conduct retrieval and reasoning based solely on the whole original question, without any processing of the question. To tackle this limitation, we propose a KG-enhanced LLM framework based on question decomposition and atomic retrieval, called KELDaR. We introduce a question decomposition tree as the framework for LLM reasoning. This approach extracts the implicit information about reasoning steps within complex questions, which serves as a guide for atomic retrieval on the KG targeting the atomic-level simple questions at the leaves of the tree. Additionally, we design strategies for atomic retrieval, which extract and retrieve question-relevant KG subgraphs to assist the few-shot LLM in answering atomic-level questions. Experiments on KGQA datasets demonstrate that our framework outperforms existing reasoning-based baselines. Moreover, in a low-cost setting without additional training or fine-tuning, our framework achieves competitive or superior results compared to most existing training-based baselines.

Recently, significant progress has been made in employing Large Language Models (LLMs) for semantic parsing to address Knowledge Base Question Answering (KBQA) tasks. Previous work utilizes LLMs to generate query statements on Knowledge Bases (KBs) for retrieving answers. However, in these methods LLMs often generate incorrect query statements because they lack relevant knowledge. To address this, we propose a framework called Augmenting Reasoning Capabilities of LLMs with Graph Structures in Knowledge Base Question Answering (ARG-KBQA), which retrieves question-related graph structures to improve the performance of LLMs. Unlike other methods that directly retrieve relations or triples from KBs, we introduce an unsupervised two-stage ranker to perform multi-hop beam search on KBs, which can provide LLMs with information more relevant to the questions. Experimental results demonstrate that ARG-KBQA sets a new state of the art on GrailQA and WebQSP under the few-shot setting. Additionally, ARG-KBQA significantly outperforms previous few-shot methods on questions whose query statements are unseen in the training data.

Co-authors

Jing Xu 1

Venues

findings 4
emnlp 1
