UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Yuzhe Yang; Yifei Zhang; Yan Hu; Yilin Guo; Ruoli Gan; Yueru He; Mingcong Lei; Xiao Zhang (张晓); Haining Wang; Qianqian Xie; Jimin Huang; Honghai Yu; Benyou Wang

doi:10.18653/v1/2025.findings-naacl.300

Abstract

This paper introduces UCFE, the User-Centric Financial Expertise benchmark, an innovative framework designed to evaluate the ability of large language models (LLMs) to handle complex real-world financial tasks. The UCFE benchmark adopts a hybrid approach that combines human expert evaluations with dynamic, task-specific interactions to simulate the complexities of evolving financial scenarios. First, we conducted a user study involving 804 participants and collected their feedback on financial tasks. Second, based on this feedback, we created a dataset that encompasses a wide range of user intents and interactions. This dataset serves as the foundation for benchmarking 11 LLM services using the LLM-as-Judge methodology. Our results show a significant alignment between benchmark scores and human preferences, with a Pearson correlation coefficient of 0.78, confirming the effectiveness of the UCFE dataset and our evaluation approach. The UCFE benchmark not only reveals the potential of LLMs in the financial domain but also provides a robust framework for assessing their performance and user satisfaction.
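As a rough illustration of the alignment measure named in the abstract, the sketch below computes a Pearson correlation coefficient between two lists of per-model scores. It is only a minimal sketch under stated assumptions: the function name `pearson_r` and the score lists are hypothetical placeholders, not the paper's code or data, and the paper may compute the statistic differently.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, one entry per model; NOT results from the paper.
benchmark_scores = [1020.0, 985.5, 1101.2, 940.3, 1050.8]
human_preference_scores = [0.61, 0.48, 0.72, 0.40, 0.55]

r = pearson_r(benchmark_scores, human_preference_scores)
print(f"Pearson r = {r:.2f}")
```

A value near 1 would indicate that models ranked highly by the automated judge also tend to be preferred by human evaluators; the abstract reports 0.78 for the actual study.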

Anthology ID: 2025.findings-naacl.300
Volume: Findings of the Association for Computational Linguistics: NAACL 2025
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5429–5448
URL: https://preview.aclanthology.org/moar-dois/2025.findings-naacl.300/
DOI: 10.18653/v1/2025.findings-naacl.300
Cite (ACL): Yuzhe Yang, Yifei Zhang, Yan Hu, Yilin Guo, Ruoli Gan, Yueru He, Mingcong Lei, Xiao Zhang, Haining Wang, Qianqian Xie, Jimin Huang, Honghai Yu, and Benyou Wang. 2025. UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 5429–5448, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models (Yang et al., Findings 2025)
PDF: https://preview.aclanthology.org/moar-dois/2025.findings-naacl.300.pdf