STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases

Mounica Maddela; Lingjue Xie; Daniel Preoţiuc-Pietro; Mausam

Abstract

Our goal is to assess how well current Text2SQL systems support SQL analysts in their primary work of performing complex analytics on specialized relational databases. Although several benchmarks evaluate Text2SQL models, the complexity of questions (and the output SQL queries) in most datasets is inherently limited: they do not focus on intents involving analytics and reasoning. In response, we present STARQA, the first public human-created dataset focused on complex analytical questions and answers (involving nested joins, time series analytics, statistical operations, and more) on three specialized-domain databases. In addition to standard Text2SQL baselines, we also evaluate a novel approach (Text2SQLCode) that decomposes the task through a combination of SQL and Python: SQL is responsible for data fetch, and Python more naturally performs reasoning. Our results demonstrate that both existing Text2SQL systems and our Text2SQLCode approach find STARQA questions quite challenging, even though Text2SQLCode achieves better performance on the more difficult questions. Further analyses assess the typical errors made by existing systems and chart a research path for pushing the capabilities of real-world systems.
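
The SQL/Python split described in the abstract can be illustrated with a minimal sketch. This is not the authors' Text2SQLCode implementation; it only shows the general idea of letting SQL fetch rows and Python do the analytical step. The `sales` table, its columns, the sample values, and all function names are hypothetical and chosen purely for illustration.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (month TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [
            ("2024-01", 100.0),
            ("2024-02", 110.0),
            ("2024-03", 99.0),
            ("2024-04", 148.5),
        ],
    )
    return conn


def fetch_revenue(conn):
    """SQL step: fetch the raw time series, with no analytics in the query."""
    query = "SELECT month, revenue FROM sales ORDER BY month"
    return conn.execute(query).fetchall()


def largest_monthly_growth(rows):
    """Python step: compute month-over-month growth and pick the largest."""
    growth = [
        (cur_month, (cur_rev - prev_rev) / prev_rev)
        for (_, prev_rev), (cur_month, cur_rev) in zip(rows, rows[1:])
    ]
    return max(growth, key=lambda item: item[1])


if __name__ == "__main__":
    conn = build_demo_db()
    month, rate = largest_monthly_growth(fetch_revenue(conn))
    print(f"Largest month-over-month growth: {month} ({rate:.0%})")
```

Expressing the same question in SQL alone would typically require a window function or a self-join, whereas here the query stays a plain ordered select and the time-series logic sits in a few lines of Python.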

Anthology ID: 2025.emnlp-main.1749
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 34475–34487
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1749/
Cite (ACL): Mounica Maddela, Lingjue Xie, Daniel Preotiuc-Pietro, and Mausam. 2025. STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 34475–34487, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases (Maddela et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1749.pdf
Checklist: 2025.emnlp-main.1749.checklist.pdf
