Jun Gao

2026

DataSeer: A Manager-Centric Collaborative Multi-Agent Framework with Multi-Branch Reasoning for Automated Insight Discovery
Suchen Liu | Yuanfeng Song | Jun Gao | Xing Chen
Findings of the Association for Computational Linguistics: ACL 2026

The growth of complex data fuels demand for automated insight discovery. While LLMs and agent technologies have advanced data analysis, existing methods struggle with maintaining contextual coherence, limited coverage due to single-path exploration, and rigid planning that fails to adapt to dynamic data discovery. We propose DataSeer, a collaborative multi-agent framework for automated insight discovery. Our first contribution is a Manager-Centric Collaborative Framework, where the Manager ensures cross-episode contextual coherence through a dual-layer memory system with compression, consolidation, and retrieval, alongside dynamic prompt editing, coordinating the overall process between the Planner and Executor. Second, we optimize the planning and execution components: the Planner employs multi-role discussion for adaptive sub-goal generation and plan refinement; the Executor is endowed with tactical autonomy for exploratory execution and incorporates real-time multi-dimensional self-assessment to guarantee insight quality. Third, we design Multi-Branch Reasoning that executes multiple discovery trajectories and synthesizes outcomes through LLM-based aggregation, improving coverage and reducing single-path stochasticity. Experiments on InsightBench and InsightEval show that DataSeer outperforms baselines, achieving improvements of 18.7% and 12.1% in insight-level scores, and 11.6% and 10.3% in summary-level scores, respectively.

2025

FinRAGBench-V: A Benchmark for Multimodal RAG with Visual Citation in the Financial Domain
Suifeng Zhao | Zhuoran Jin | Sujian Li | Jun Gao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Retrieval-Augmented Generation (RAG) plays a vital role in the financial domain, powering applications such as real-time market analysis, trend forecasting, and interest rate computation. However, most existing RAG research in finance focuses predominantly on textual data, overlooking the rich visual content in financial documents, resulting in the loss of key analytical insights. To bridge this gap, we present FinRAGBench-V, a comprehensive visual RAG benchmark tailored for finance. This benchmark effectively integrates multimodal data and provides visual citation to ensure traceability. It includes a bilingual retrieval corpus with 60,780 Chinese and 51,219 English pages, along with a high-quality, human-annotated question-answering (QA) dataset spanning heterogeneous data types and seven question categories. Moreover, we introduce RGenCite, an RAG baseline that seamlessly integrates visual citation with generation. Furthermore, we propose an automatic citation evaluation method to systematically assess the visual citation capabilities of Multimodal Large Language Models (MLLMs). Extensive experiments on RGenCite underscore the challenging nature of FinRAGBench-V, providing valuable insights for the development of multimodal RAG systems in finance.

Realistic Training Data Generation and Rule Enhanced Decoding in LLM for NameGuess
Yikuan Xia | Jiazun Chen | Sujian Li | Jun Gao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

The wide use of abbreviated column names (derived from English words or Chinese Pinyin) in database tables poses significant challenges for table-centric tasks in natural language processing and database management. Expanding such column names, referred to as the NameGuess task, has previously been addressed by fine-tuning Large Language Models (LLMs) on synthetically generated rule-based data. However, the current approaches yield suboptimal performance due to two fundamental limitations: 1) the rule-generated abbreviation data fails to reflect the real-world distribution, and 2) LLMs persistently fail to follow the rule-sensitive patterns in NameGuess. For the data realism issue, we propose a novel approach that integrates a subsequence abbreviation generator trained on human-annotated data and collects non-subsequence abbreviations to improve the training set. For the rule violation issue, we propose a decoding system constrained by an automaton that represents the rules of abbreviation expansion. We extend the original English NameGuess test set to include non-subsequence and Pinyin scenarios. Experimental results show that properly tuned 7/8B moderate-size LLMs with a refined decoding system can surpass the few-shot performance of state-of-the-art LLMs, such as the GPT-4 series. The code and data are provided in the supplementary material.

Co-authors

Yuanfeng Song 1

Yikuan Xia 1

Suifeng Zhao 1

Venues

EMNLP 2
Findings 1
