Mingyang Jiang
2026
MultiFinBen: Benchmarking Large Language Models for Multilingual and Multimodal Financial Application
Xueqing Peng | Lingfei Qian | Yan Wang | Ruoyu Xiang | Yueru He | Yang Ren | Mingyang Jiang | Vincent Jim Zhang | Yuqing Guo | Jeff Zhao | Huan He | Yi Han | Yun Feng | Yuechen Jiang | Yupeng Cao | Haohang Li | Yangyang Yu | Xiaoyu Wang | Penglei Gao | Shengyuan Lin | Keyi Wang | Shanshan Yang | Yilun Zhao | Zhiwei Liu | Peng Lu | Jerry Huang | Suyuchen Wang | Triantafillos Papadopoulos | Polydoros Giannouris | Efstathia Soufleri | Nuo Chen | Zhiyang Deng | Heming Fu | Yijia Zhao | Mingquan Lin | Meikang Qiu | Kaleb E Smith | Arman Cohan | Xiao-Yang Liu | Jimin Huang | Guojun Xiong | Alejandro Lopez-Lira | Xi Chen | Junichi Tsujii | Jian-Yun Nie | Sophia Ananiadou | Qianqian Xie
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Xueqing Peng | Lingfei Qian | Yan Wang | Ruoyu Xiang | Yueru He | Yang Ren | Mingyang Jiang | Vincent Jim Zhang | Yuqing Guo | Jeff Zhao | Huan He | Yi Han | Yun Feng | Yuechen Jiang | Yupeng Cao | Haohang Li | Yangyang Yu | Xiaoyu Wang | Penglei Gao | Shengyuan Lin | Keyi Wang | Shanshan Yang | Yilun Zhao | Zhiwei Liu | Peng Lu | Jerry Huang | Suyuchen Wang | Triantafillos Papadopoulos | Polydoros Giannouris | Efstathia Soufleri | Nuo Chen | Zhiyang Deng | Heming Fu | Yijia Zhao | Mingquan Lin | Meikang Qiu | Kaleb E Smith | Arman Cohan | Xiao-Yang Liu | Jimin Huang | Guojun Xiong | Alejandro Lopez-Lira | Xi Chen | Junichi Tsujii | Jian-Yun Nie | Sophia Ananiadou | Qianqian Xie
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Real-world financial analysis involves information across multiple languages and modalities, from reports and news to scanned filings and meeting recordings. Yet most existing evaluations of LLMs in finance remain text-only, monolingual, and largely saturated by current models. To bridge these gaps, we present MultiFinBen, the first expert-annotated multilingual (five languages) and multimodal (text, vision, audio) benchmark for evaluating LLMs in realistic financial contexts. MultiFinBen introduces two new task families: multilingual financial reasoning, which tests cross-lingual evidence integration from filings and news, and financial OCR, which extracts structured text from scanned documents containing tables and charts. Rather than aggregating all available datasets, we apply a structured, difficulty-aware selection based on advanced model performance, ensuring balanced challenge and removing redundant tasks. Evaluating 21 leading LLMs shows that even frontier multimodal models like GPT-4o achieve only 46.01% overall, stronger on vision and audio but dropping sharply in multilingual settings. These findings expose persistent limitations in multilingual, multimodal, and expert-level financial reasoning. All datasets, evaluation scripts, and leaderboards are publicly released.
Search
Fix author
Co-authors
- Sophia Ananiadou 1
- Yupeng Cao 1
- Nuo Chen 1
- Xi Chen 1
- Arman Cohan 1
- Zhiyang Deng 1
- Yun Feng 1
- Heming Fu 1
- Penglei Gao 1
- Polydoros Giannouris 1
- Yuqing Guo 1
- Yi Han 1
- Yueru He 1
- Huan He 1
- Jerry Huang 1
- Jimin Huang 1
- Yuechen Jiang 1
- Haohang Li 1
- Shengyuan Lin 1
- Mingquan Lin 1
- Zhiwei Liu 1
- Xiao-Yang Liu 1
- Alejandro Lopez-Lira 1
- Peng Lu 1
- Jian-Yun Nie 1
- Triantafillos Papadopoulos 1
- Xueqing Peng 1
- Lingfei Qian 1
- Meikang Qiu 1
- Yang Ren 1
- Kaleb E. Smith 1
- Efstathia Soufleri 1
- Jun’ichi Tsujii 1
- Yan Wang 1
- Xiaoyu Wang 1
- Keyi Wang 1
- Suyuchen Wang 1
- Ruoyu Xiang 1
- Qianqian Xie 1
- Guojun Xiong 1
- Shanshan Yang 1
- Yangyang Yu 1
- Vincent Jim Zhang 1
- Jeff Zhao 1
- Yilun Zhao 1
- Yijia Zhao 1
Venues
- ACL1