Aestar at SemEval-2025 Task 8: Agentic LLMs for Question Answering over Tabular Data

Rishit Tyagi, Mohit Gupta, Rahul Bouri


Abstract
Question Answering over Tabular Data (Table QA) presents unique challenges due to the diverse structure, size, and data types of real-world tables. SemEval-2025 Task 8 (DataBench) introduced a benchmark composed of large-scale, domain-diverse datasets to evaluate the ability of models to accurately answer structured queries. We propose a Natural Language to SQL (NL-to-SQL) approach leveraging large language models (LLMs) such as GPT-4o, GPT-4o-mini, and DeepSeek v2:16b to generate SQL queries dynamically. Our system follows a multi-stage pipeline involving example selection, SQL query generation, answer extraction, verification, and iterative refinement. Experiments demonstrate the effectiveness of our approach, achieving 70.5% accuracy on DataBench QA and 71.6% on DataBench Lite QA, significantly surpassing the baseline scores of 26% and 27% respectively. This paper details our methodology, experimental results, and alternative approaches, providing insights into the strengths and limitations of LLM-driven Table QA.
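The abstract's generate–execute–verify–refine loop can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `llm` callable stands in for a model such as GPT-4o, the prompt wording and the retry budget are assumptions, and the paper's example-selection and answer-verification stages are reduced to placeholders.

```python
import sqlite3


def generate_sql(question: str, schema: str, llm) -> str:
    # llm is any callable mapping a prompt string to a SQL string
    # (e.g. a GPT-4o wrapper in the paper's setting; stubbed in tests).
    prompt = f"Schema: {schema}\nQuestion: {question}\nSQL:"
    return llm(prompt).strip()


def answer_with_refinement(question: str, conn: sqlite3.Connection,
                           schema: str, llm, max_attempts: int = 3):
    """Generate SQL, execute it, and on failure feed the database
    error back into the prompt for iterative refinement."""
    feedback = ""
    for _ in range(max_attempts):
        sql = generate_sql(question + feedback, schema, llm)
        try:
            rows = conn.execute(sql).fetchall()
            # A fuller system would verify the answer's type and
            # plausibility here before accepting it.
            return rows
        except sqlite3.Error as err:
            feedback = f"\nPrevious query failed with: {err}. Fix the SQL."
    return None  # all attempts exhausted
```

With a deterministic stub in place of the LLM, `answer_with_refinement("How many rows?", conn, "t(x INTEGER)", stub)` returns the executed query's result rows.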
Anthology ID:
2025.semeval-1.292
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2249–2255
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.292/
Cite (ACL):
Rishit Tyagi, Mohit Gupta, and Rahul Bouri. 2025. Aestar at SemEval-2025 Task 8: Agentic LLMs for Question Answering over Tabular Data. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2249–2255, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Aestar at SemEval-2025 Task 8: Agentic LLMs for Question Answering over Tabular Data (Tyagi et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.292.pdf