ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation

Atakan Site, Emre Erdemir, Gülşen Eryiğit


Abstract
This paper presents our system for SemEval-2025 Task 8: DataBench, Question-Answeringover Tabular Data. The primary objective ofthis task is to perform question answering ongiven tabular datasets from diverse domains;under two subtasks: DataBench QA (SubtaskI) and DataBench Lite QA (Subtask II). Totackle both subtasks, we developed a zero-shotsolution with a particular emphasis on lever-aging Large Language Model (LLM)-basedcode generation. Specifically, we proposeda Python code generation framework, utiliz-ing state-of-the-art open-source LLMs to gen-erate executable Pandas code via optimizedprompting strategies. Our experiments revealthat different LLMs exhibit varying levels ofeffectiveness in Python code generation. Addi-tionaly, results show that Python code genera-tion achieves superior performance in tabularquestion answering compared to alternative ap-proaches. Although our ranking among zero-shot systems is unknown at the time of this pa-per’s submission, our system achieved eighthplace in Subtask I and sixth place in Subtask IIamong the 30 systems that outperformed thebaseline in the open-source models category.
Anthology ID:
2025.semeval-1.198
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1504–1514
Language:
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.198/
DOI:
Bibkey:
Cite (ACL):
Atakan Site, Emre Erdemir, and Gülşen Eryiğit. 2025. ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1504–1514, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
ITUNLP at SemEval-2025 Task 8: Question-Answering over Tabular Data: A Zero-Shot Approach using LLM-Driven Code Generation (Site et al., SemEval 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.198.pdf