Firefly Team at SemEval-2025 Task 8: Question-Answering over Tabular Data using SQL/Python generation with Closed-Source Large Language Models
Ho Thuy Nga, Ho Thi Thanh Tuyen, Le Minh Hung, Dang Van Thin
Abstract
In this paper, we describe the Firefly team's official system for the two subtasks of SemEval-2025 Task 8: Question-Answering over Tabular Data. Our solution employs large language models (LLMs) to translate natural language queries into executable code, specifically Python and SQL, which is then executed to generate answers categorized into five predefined types. Our empirical evaluation shows that Python code generation outperforms SQL generation for this challenge. The experimental results also show that our system achieves competitive performance in both subtasks. In Subtask I: Databench QA, which covers datasets of any size, we rank in the top 9. In Subtask II: Databench QA Lite, where datasets are restricted to a maximum of 20 rows, we rank 5th.
- Anthology ID:
- 2025.semeval-1.136
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1028–1033
- URL:
- https://preview.aclanthology.org/fix___bootstrap-utility-classes/2025.semeval-1.136/
- Cite (ACL):
- Ho Thuy Nga, Ho Thi Thanh Tuyen, Le Minh Hung, and Dang Van Thin. 2025. Firefly Team at SemEval-2025 Task 8: Question-Answering over Tabular Data using SQL/Python generation with Closed-Source Large Language Models. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1028–1033, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Firefly Team at SemEval-2025 Task 8: Question-Answering over Tabular Data using SQL/Python generation with Closed-Source Large Language Models (Nga et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/fix___bootstrap-utility-classes/2025.semeval-1.136.pdf
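
As a rough illustration of the code-generation approach described in the abstract, the sketch below runs a generated SQL query and a generated Python function against the same toy table. Everything in it is an assumption made for illustration: `fake_llm` is a hypothetical stand-in that returns canned code instead of calling a model, the table and question are made up, and the paper's actual prompts, models, and execution setup are not reproduced here.

```python
import sqlite3

# Toy table standing in for a small tabular dataset (made-up rows).
ROWS = [
    {"name": "Ana", "age": 31, "city": "Hue"},
    {"name": "Binh", "age": 45, "city": "Hanoi"},
    {"name": "Chi", "age": 27, "city": "Hue"},
]


def fake_llm(question: str, target: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned code."""
    if target == "sql":
        return "SELECT AVG(age) FROM data WHERE city = 'Hue'"
    return (
        "def answer(rows):\n"
        "    ages = [r['age'] for r in rows if r['city'] == 'Hue']\n"
        "    return sum(ages) / len(ages)\n"
    )


def run_sql(query: str, rows: list) -> object:
    """Load the rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (name TEXT, age INTEGER, city TEXT)")
    conn.executemany("INSERT INTO data VALUES (:name, :age, :city)", rows)
    result = conn.execute(query).fetchone()[0]
    conn.close()
    return result


def run_python(code: str, rows: list) -> object:
    """Execute generated code that defines answer(rows), then call it.

    exec() is unsafe on untrusted code; a real system would sandbox this.
    """
    namespace = {}
    exec(code, namespace)
    return namespace["answer"](rows)


question = "What is the average age of people living in Hue?"
sql_answer = run_sql(fake_llm(question, "sql"), ROWS)
py_answer = run_python(fake_llm(question, "python"), ROWS)
print(sql_answer, py_answer)
```

Both paths return the same number here, which makes it easy to compare the two generation targets on a single question; in a real pipeline the generated code may fail to execute, so each call would normally be wrapped in error handling.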