STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases
Mounica Maddela, Lingjue Xie, Daniel Preotiuc-Pietro, Mausam
Abstract
Our goal is to assess how well current Text2SQL systems support SQL analysts in their primary work of performing complex analytics on specialized relational databases. Although several benchmarks evaluate Text2SQL models, the complexity of questions (and the output SQL queries) in most datasets is inherently limited – they do not focus on intents involving analytics and reasoning. In response, we present STARQA, the first public human-created dataset focused on complex analytical questions and answers (involving nested joins, time series analytics, statistical operations, and more) on three specialized-domain databases. In addition to standard Text2SQL baselines, we also evaluate a novel approach (Text2SQLCode) that decomposes the task through a combination of SQL and Python: SQL is responsible for data fetch, and Python more naturally performs reasoning. Our results demonstrate that both existing Text2SQL systems and our Text2SQLCode approach find STARQA questions quite challenging, even though Text2SQLCode achieves better performance on the more difficult questions. Further analyses assess the typical errors made by existing systems and chart a research path for pushing the capabilities of real-world systems.
- Anthology ID:
- 2025.emnlp-main.1749
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 34475–34487
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1749/
- Cite (ACL):
- Mounica Maddela, Lingjue Xie, Daniel Preotiuc-Pietro, and Mausam. 2025. STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 34475–34487, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- STARQA: A Question Answering Dataset for Complex Analytical Reasoning over Structured Databases (Maddela et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1749.pdf