AlphaPro at SemEval-2025 Task 8: A Code Generation Approach for Question-Answering over Tabular Data

Anshuman Aryan, Laukik Wadhwa, Kalki Eshwar, Aakarsh Sinha, Durgesh Kumar


Abstract
This paper describes the AlphaPro team’s solution to SemEval-2025 Task 8: Question Answering on Tabular Data. Our system uses a three-stage pipeline that takes a natural language question together with the table’s structural information and generates executable Python code, which is then run against the table to produce an answer. The method achieves up to 67% accuracy on the task data, demonstrating the feasibility of code generation for tabular question answering. We outline the strengths and limitations of the approach and suggest directions for further research. The code is available in a public repository to promote reproducibility and further research in this area.
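The general idea of answering a table question by generating and executing code can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say what the three stages are, so the split below (schema description, code generation, execution) is an assumption, and every function name, the prompt wording, and the toy table are hypothetical. The model call is replaced by a stub that returns a canned program, and the table is a plain list of dicts so the example runs with the standard library only.

```python
from typing import Any, Callable

# A table is modeled as a list of row dicts (an assumption made for this sketch).
Table = list[dict[str, Any]]


def describe_schema(table: Table) -> str:
    """Assumed stage 1: summarize column names and value types from the first row."""
    first_row = table[0]
    return ", ".join(
        f"{name} ({type(value).__name__})" for name, value in first_row.items()
    )


def build_prompt(question: str, schema: str) -> str:
    """Combine the question and the schema into a code-generation prompt."""
    return (
        "Write a Python function answer(rows) for a table given as a list of dicts.\n"
        f"Columns: {schema}\n"
        f"Question: {question}\n"
    )


def fake_generate_code(prompt: str) -> str:
    """Assumed stage 2: hypothetical stand-in for a language-model call.

    It ignores the prompt and returns a canned program for the toy question below.
    """
    return (
        "def answer(rows):\n"
        "    return max(rows, key=lambda r: r['price'])['item']\n"
    )


def run_generated_code(code: str, table: Table) -> Any:
    """Assumed stage 3: execute the generated code and call its answer() function."""
    namespace: dict[str, Any] = {}
    exec(code, namespace)  # No sandboxing here; unsafe for untrusted code.
    return namespace["answer"](table)


def answer_question(
    question: str,
    table: Table,
    generate: Callable[[str], str] = fake_generate_code,
) -> Any:
    """Chain the three assumed stages: schema, generation, execution."""
    schema = describe_schema(table)
    prompt = build_prompt(question, schema)
    code = generate(prompt)
    return run_generated_code(code, table)


# Toy table, invented purely for illustration.
table = [
    {"item": "pen", "price": 2},
    {"item": "book", "price": 12},
]
result = answer_question("Which item is the most expensive?", table)
print(result)  # book
```

In a real system the `generate` argument would wrap a model call, and the execution step would need isolation and error handling, since generated code may be incorrect or unsafe.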
Anthology ID:
2025.semeval-1.307
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
2358–2367
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.307/
Cite (ACL):
Anshuman Aryan, Laukik Wadhwa, Kalki Eshwar, Aakarsh Sinha, and Durgesh Kumar. 2025. AlphaPro at SemEval-2025 Task 8: A Code Generation Approach for Question-Answering over Tabular Data. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2358–2367, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
AlphaPro at SemEval-2025 Task 8: A Code Generation Approach for Question-Answering over Tabular Data (Aryan et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.307.pdf