RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?

Yuyang Dai; Yan Lin; Zhuohan Xie; Yuxia Wang

Abstract

Reliable financial reasoning requires knowing not only how to answer, but also when an answer cannot be justified. In real financial practice, problems often rely on implicit assumptions that are taken for granted rather than stated explicitly, causing problems to appear solvable while lacking enough information for a definite answer. We introduce RealFin, a bilingual benchmark that evaluates financial reasoning by systematically removing essential premises from exam-style questions while keeping them linguistically plausible. Based on this, we evaluate models under three formulations that test answering, recognizing missing information, and rejecting unjustified options, and find consistent performance drops when key conditions are absent. General-purpose models tend to over-commit and guess, while most finance-specialized models fail to clearly identify missing premises. These results highlight a critical gap in current evaluations and show that reliable financial models must know when a question should not be answered. The dataset and code are available at https://github.com/insait-institute/RealFin.

Anthology ID: 2026.findings-acl.1255
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 25050–25080
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1255/
Cite (ACL): Yuyang Dai, Yan Lin, Zhuohan Xie, and Yuxia Wang. 2026. RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid? In Findings of the Association for Computational Linguistics: ACL 2026, pages 25050–25080, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid? (Dai et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1255.pdf
Checklist: 2026.findings-acl.1255.checklist.pdf
