Margarita Trofimova

2025

TabaQA at SemEval-2025 Task 8: Column Augmented Generation for Question Answering over Tabular Data
Ekaterina Antropova | Egor Kratkov | Roman Derunets | Margarita Trofimova | Ivan Bondarenko | Alexander Panchenko | Vasily Konovalov | Maksim Savkin
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

The DataBench shared task at SemEval-2025 addresses question answering (QA) over tabular data. Because table structures vary widely, there are several approaches to retrieving the answer. Although Retrieval-Augmented Generation (RAG) is a viable solution, extracting relevant information from tables remains challenging. In addition, a table can be too large to fit directly into the LLM context. In this paper, we address QA over tabular data in three steps: first, we identify the relevant columns that might contain the answer; then the LLM generates an answer given the context of those columns; and finally, the LLM refines its answer. This approach secured us 7th place in the DataBench lite category.
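
The three steps named in the abstract (column selection, answer generation, refinement) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `select_columns` and `answer_question`, the prompt wording, the `max_rows` cutoff, and the representation of a table as a dict of columns are all assumptions made for illustration, and `llm` stands for any callable that maps a prompt string to a reply string.

```python
from typing import Callable, Dict, List

# Assumed table representation: column name -> list of cell values.
Table = Dict[str, List[object]]
LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def select_columns(question: str, table: Table, llm: LLM) -> List[str]:
    """Step 1: ask the model which columns might contain the answer."""
    prompt = (
        f"Question: {question}\n"
        f"Columns: {', '.join(table)}\n"
        "List the relevant column names, comma-separated."
    )
    chosen = [name.strip() for name in llm(prompt).split(",")]
    # Keep only names that really exist; fall back to all columns if none do.
    kept = [name for name in chosen if name in table]
    return kept or list(table)


def answer_question(question: str, table: Table, llm: LLM, max_rows: int = 20) -> str:
    """Steps 2 and 3: answer from the selected columns, then refine the draft."""
    columns = select_columns(question, table, llm)
    # Only the selected columns (truncated to max_rows) enter the context.
    context = "\n".join(f"{name}: {table[name][:max_rows]}" for name in columns)
    draft = llm(f"Context:\n{context}\nQuestion: {question}\nAnswer:")
    refined = llm(
        f"Context:\n{context}\nQuestion: {question}\n"
        f"Draft answer: {draft}\n"
        "Check the draft and return a corrected final answer:"
    )
    return refined.strip()
```

Restricting the context to the selected columns is what keeps a large table within the model's context window in this sketch; how the actual system builds its context and performs refinement is described in the paper itself.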

Co-authors

Alexander Panchenko 1

Maksim Savkin 1

Venues

semeval 1
ws 1
