Jiexiang Wang
2026
FinMRAGBench: A Realistic and Complex Benchmark for Multi-Modal RAG in Financial Document Analysis
Shouqing Yang | Qi Zhang | Yuhang Yang | Ruikang Xu | Yuwei Hou | Zhulin Jia | Lirong Gao | Haobo Wang | Jinglei Chen | Jiexiang Wang | Sheng Guo | Bo Zheng | Gang Chen
Findings of the Association for Computational Linguistics: ACL 2026
Retrieval-augmented generation (RAG) has become a widely adopted paradigm for realistic financial analysis over financial documents. However, existing benchmarks fail to capture realistic financial analysis settings that involve cross-document retrieval, multi-page evidence integration, and diverse analytical tasks. To address this gap, we introduce FinMRAGBench, a comprehensive multi-modal financial RAG benchmark in which most questions require retrieving evidence scattered across multiple pages and documents, constructed from large-scale real-world annual reports and comprising 887 expert-verified QA pairs spanning five representative financial analysis tasks. Moreover, we introduce FinMRAGAgent, an agent trained on high-quality agentic trajectories following the reasoning-and-acting (ReAct) paradigm, capable of dynamic tool invocation and multi-step financial analysis. Our extensive experiments show that current multi-modal RAG systems still struggle with incomplete retrieval and complex financial reasoning. In contrast, FinMRAGAgent achieves the strongest overall performance across all models, demonstrating that our structured reasoning approach significantly enhances multi-modal RAG in realistic financial scenarios. The code and data are available at https://github.com/sqyangit/FinMRAGBench.
2025
LeTS: Learning to Think-and-Search via Process-and-Outcome Reward Hybridization
Qi Zhang | Shouqing Yang | Lirong Gao | Hao Chen | Xiaomeng Hu | Jinglei Chen | Jiexiang Wang | Sheng Guo | Bo Zheng | Haobo Wang | Junbo Zhao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large language models (LLMs) have demonstrated impressive capabilities in reasoning with the emergence of reasoning models like OpenAI-o1 and DeepSeek-R1. Recent research focuses on integrating reasoning capabilities into the realm of retrieval-augmented generation (RAG) via outcome-supervised reinforcement learning (RL) approaches, while the correctness of intermediate think-and-search steps is usually neglected. To address this issue, we design a process-level reward module to mitigate the unawareness of intermediate reasoning steps in outcome-level supervision without additional annotation. Grounded on this, we propose **Le**arning to **T**hink-and-**S**earch (**LeTS**), a novel framework that hybridizes stepwise process reward and outcome-based reward to current RL methods for RAG. Extensive experiments demonstrate the generalization and inference efficiency of **LeTS** across various RAG benchmarks. In addition, these results reveal the potential of process- and outcome-level reward hybridization in boosting LLMs’ reasoning ability via RL under other scenarios.