AlphaPro at SemEval-2025 Task 8: A Code Generation Approach for Question-Answering over Tabular Data
Anshuman Aryan, Laukik Wadhwa, Kalki Eshwar, Aakarsh Sinha, Durgesh Kumar
Abstract
This paper describes the AlphaPro team’s solution to SemEval-2025 Task 8: Question Answering on Tabular Data. Our system uses a three-stage pipeline that takes a natural language question together with the table’s structural information and generates executable Python code, which is then run against the table to produce the answer. The method achieves up to 67% accuracy on the task data, demonstrating the feasibility of code generation for tabular question answering. We outline the strengths and limitations of the approach and suggest directions for further research. The code is available in a public repository to promote reproducibility and further research in this area.
- Anthology ID: 2025.semeval-1.307
- Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues: SemEval | WS
- Publisher: Association for Computational Linguistics
- Pages: 2358–2367
- URL: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.307/
- Cite (ACL): Anshuman Aryan, Laukik Wadhwa, Kalki Eshwar, Aakarsh Sinha, and Durgesh Kumar. 2025. AlphaPro at SemEval-2025 Task 8: A Code Generation Approach for Question-Answering over Tabular Data. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2358–2367, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): AlphaPro at SemEval-2025 Task 8: A Code Generation Approach for Question-Answering over Tabular Data (Aryan et al., SemEval 2025)
- PDF: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.307.pdf