Reasoning Graph-Structured Question Answering: Datasets and Insights from LLM Benchmarking

Khin Yone, Devasha Trivedi, Anish Pahilajani, Jincen Shuai, Samyak Rajesh Jain, Ryan Rossi, Nesreen K. Ahmed, Franck Dernoncourt, Yu Wang, Namyong Park


Abstract
Large Language Models (LLMs) have shown remarkable success in multi-hop question answering (M-QA) due to their advanced reasoning capabilities. However, the influence of reasoning structures on their performance remains underexplored, primarily due to the lack of M-QA datasets that explicitly encode the reasoning pathways underlying each question-answer pair. To address this gap, we introduce the Reasoning Graph-Structured Question Answering dataset (GRS-QA), which provides both semantic contexts and reasoning structures for QA pairs. Unlike existing M-QA datasets, GRS-QA explicitly captures intricate reasoning pathways through reasoning graphs, where nodes correspond to textual contexts and edges denote logical flows. Using GRS-QA, we systematically evaluate LLM performance across varying context structures, prompting styles, and data domains. Our empirical analysis reveals that LLMs perform differently depending on reasoning structure, context, and prompting style, indicating their varying ability to leverage graph-structured knowledge. Notably, providing explicit reasoning guidance proves more effective than supplying contextual information alone.
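To make the abstract's notion of a reasoning graph concrete, here is a minimal, purely illustrative sketch in Python. The entry layout (field names, node ids, example sentences) is an assumption for illustration and is not the released GRS-QA schema; it only mirrors the description above, where nodes hold textual contexts and directed edges encode the logical flow toward the answer.

```python
# Hypothetical reasoning-graph entry (NOT the official GRS-QA schema):
# nodes carry supporting textual contexts; directed edges encode the
# logical flow from premise facts to the fact that resolves the answer.
entry = {
    "question": "Which country is the birthplace of the director of Film X?",
    "answer": "France",
    "nodes": {
        "n1": "Film X was directed by Director Y.",
        "n2": "Director Y was born in France.",
    },
    # n1's fact must be established before n2 can resolve the answer
    "edges": [("n1", "n2")],
}

def reasoning_order(nodes, edges):
    """Return node ids in topological order, i.e. one valid reasoning pathway."""
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    frontier = [n for n, d in indegree.items() if d == 0]
    order = []
    while frontier:
        n = frontier.pop()
        order.append(n)
        for src, dst in edges:
            if src == n:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    frontier.append(dst)
    return order

print(reasoning_order(entry["nodes"], entry["edges"]))  # ['n1', 'n2']
```

A benchmark harness could use such an ordering to present contexts in (or out of) their logical sequence, matching the paper's comparison of explicit reasoning guidance against raw context.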
Anthology ID:
2026.lrec-main.414
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
SIG:
Publisher:
ELRA Language Resource Association
Note:
Pages:
5301–5316
Language:
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.414/
DOI:
Bibkey:
Cite (ACL):
Khin Yone, Devasha Trivedi, Anish Pahilajani, Jincen Shuai, Samyak Rajesh Jain, Ryan Rossi, Nesreen K. Ahmed, Franck Dernoncourt, Yu Wang, and Namyong Park. 2026. Reasoning Graph-Structured Question Answering: Datasets and Insights from LLM Benchmarking. In Proceedings of the Fifteenth Language Resources and Evaluation Conference, pages 5301–5316, Palma de Mallorca, Spain. ELRA Language Resource Association.
Cite (Informal):
Reasoning Graph-Structured Question Answering: Datasets and Insights from LLM Benchmarking (Yone et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.414.pdf