RePanda: Pandas-powered Tabular Verification and Reasoning

Atoosa Chegini, Keivan Rezaei, Hamid Eghbalzadeh, Soheil Feizi


Abstract
Fact-checking tabular data is essential for ensuring the accuracy of structured information in domains such as journalism, finance, and scientific research. However, existing methods often rely on black-box models with opaque reasoning. We introduce RePanda, a structured fact verification approach that translates claims into executable pandas queries, enabling interpretable and verifiable reasoning.To train RePanda, we construct PanTabFact, a structured dataset derived from TabFact, where claims are paired with executable queries generated using DeepSeek-Chat and refined through automated error correction. Fine-tuning DeepSeek-coder-7B-instruct-v1.5 on PanTabFact, RePanda achieves 84.09% accuracy on TabFact. To assess Out-of-Distribution (OOD) generalization, we create a dataset named WikiFact from WikiTableQuestions by transforming question-answer pairs into factual claims. Without additional fine-tuning, RePanda achieves 84.72% accuracy on WikiFact, significantly outperforming all other baselines and demonstrating strong OOD robustness. PanTabFact is publically available on HuggingFace at datasets/AtoosaChegini/PanTabFact.Beyond fact verification, RePanda extends to tabular question answering by generating executable queries that retrieve precise answers. To support this, we introduce PanWiki, a dataset mapping WikiTableQuestions to pandas queries. Fine-tuning on PanWiki, RePanda achieves 75.1% accuracy in direct answer retrieval. These results highlight the effectiveness of structured execution-based reasoning for tabular verification and question answering.
Anthology ID:
2025.acl-long.1549
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
32200–32212
Language:
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1549/
DOI:
Bibkey:
Cite (ACL):
Atoosa Chegini, Keivan Rezaei, Hamid Eghbalzadeh, and Soheil Feizi. 2025. RePanda: Pandas-powered Tabular Verification and Reasoning. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 32200–32212, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
RePanda: Pandas-powered Tabular Verification and Reasoning (Chegini et al., ACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1549.pdf