CSIRO LT at SemEval-2025 Task 8: Answering Questions over Tabular Data using LLMs
Tomas Turek | Shakila Mahjabin Tonni | Vincent Nguyen | Huichen Yang | Sarvnaz Karimi
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Question answering over large tables is challenging due to the reasoning required to link information from different parts of a table, such as headings and metadata, to the values in the table and to the information need. We investigate using Large Language Models (LLMs) for tabular reasoning, where, given a table and a question from the DataBench benchmark, the models generate answers. We experiment with three techniques that enable symbolic reasoning through code execution: a direct code prompting (DCP) approach, ‘DCP_Py’, which uses Python; multi-step code (MSC) prompting, ‘MSC_SQL+FS’, which uses SQL; and ReAct prompting, ‘MSR_Py+FS’, which combines multi-step reasoning (MSR), few-shot (FS) learning, and Python tools. We also conduct an analysis of the impact of answer types, data size, and multi-column dependencies on LLMs’ answer generation performance, including an assessment of the models’ limitations and the underlying challenges of tabular reasoning with LLMs.
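To illustrate the direct code prompting idea described above, the following is a minimal sketch in Python. It assumes a pandas DataFrame as the table and a hypothetical query_llm function standing in for the model call; the prompt template and the stubbed model response are illustrative assumptions, not the authors' implementation.

import pandas as pd

def build_prompt(df: pd.DataFrame, question: str) -> str:
    # Expose only the schema and a few sample rows, since a full
    # table may exceed the model's context window.
    return (
        "You are given a pandas DataFrame `df` with columns "
        f"{list(df.columns)} and dtypes {df.dtypes.to_dict()}.\n"
        f"Sample rows:\n{df.head(3).to_string()}\n\n"
        f"Question: {question}\n"
        "Write Python code that assigns the answer to a variable `answer`."
    )

def answer_question(df: pd.DataFrame, question: str, query_llm) -> object:
    # query_llm is a hypothetical callable returning model-generated Python.
    code = query_llm(build_prompt(df, question))
    scope = {"df": df, "pd": pd}
    exec(code, scope)  # symbolic reasoning via code execution
    return scope.get("answer")

# Usage with a stubbed model response in place of a real LLM:
df = pd.DataFrame({"country": ["AU", "NZ"], "population": [26_000_000, 5_000_000]})
stub = lambda prompt: "answer = df.loc[df['population'].idxmax(), 'country']"
print(answer_question(df, "Which country has the largest population?", stub))  # AU

Executing generated code rather than asking the model to read values directly is what makes the reasoning symbolic: the arithmetic and filtering are delegated to pandas, so answer correctness depends on the generated program rather than on the model reproducing cell values.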