@inproceedings{wang-balloccu-2026-arqa,
    title     = {{ARQA}: A Benchmark for Grounded Table{--}Text {QA} in Enterprise Annual Reports},
    author    = {Wang, Ruilong and
                 Balloccu, Simone},
    editor    = {Matusevych, Yevgen and
                 Eryi{\u{g}}it, G{\"u}l{\c{s}}en and
                 Aletras, Nikolaos},
    booktitle = {Proceedings of the 19th Conference of the {E}uropean Chapter of the {A}ssociation for {C}omputational {L}inguistics (Volume 5: Industry Track)},
    month     = mar,
    year      = {2026},
    address   = {Rabat, Morocco},
    publisher = {Association for Computational Linguistics},
    url       = {https://aclanthology.org/2026.eacl-industry.63/},
    pages     = {847--868},
    isbn      = {979-8-89176-384-5},
    abstract  = {Annual reports communicate corporate performance to stakeholders through dense tables and explanatory text, with rich grounding signals making automated reasoning challenging. Existing QA benchmarks focus on retrieval or single-modality reasoning and rarely require justification for answers with both textual and tabular evidence. We introduce ARQA (Annual Report QA), a benchmark of {\textasciitilde}2.5K QA pairs spanning ten fiscal years of automotive enterprise annual reports and three reasoning families {---} Lookup, Arithmetic, and Insight. Data are produced via a planner{--}generator pipeline, deterministically verified and recomputed, and fully reviewed by domain experts. We evaluate state-of-the-art instruction-tuned language models on ARQA, showing strong factual retrieval but persistent weaknesses in grounded arithmetic and causal reasoning. We release ARQA and its evaluation toolkit to facilitate research on auditable, evidence-first reasoning over enterprise documents. (https://github.com/RuilongWang/ARQA-Benchmark/)},
}
Markdown (Informal)
[ARQA: A Benchmark for Grounded Table–Text QA in Enterprise Annual Reports](https://aclanthology.org/2026.eacl-industry.63/) (Wang & Balloccu, EACL 2026)
ACL