Tianhao Peng
2026
MT-Video-Bench: A Holistic Video Understanding Benchmark for Evaluating Multimodal LLMs in Multi-Turn Dialogues
Yaning Pan | Qianqian Xie | Guohui Zhang | Zekun Moore Wang | Yongqian Wen | Yuanxing Zhang | Haoxuan Hu | Zhiyu Pan | Yibing Huang | Zhidong Gan | Yonghong Lin | An Ping | Shihao Li | Yanghai Wang | Tianhao Peng | Jiaheng Liu
Findings of the Association for Computational Linguistics: ACL 2026
The recent development of Multimodal Large Language Models (MLLMs) has significantly advanced AI’s ability to understand visual modalities. However, existing evaluation benchmarks remain limited to single-turn question answering, overlooking the complexity of multi-turn dialogues in real-world scenarios. To bridge this gap, we introduce MT-Video-Bench, a holistic video understanding benchmark for evaluating MLLMs in multi-turn dialogues. Specifically, our MT-Video-Bench mainly assesses six core competencies that focus on perceptivity and interactivity, encompassing 1,000 meticulously curated multi-turn dialogues from diverse domains. These capabilities are rigorously aligned with real-world applications, such as interactive sports analysis and multi-turn video-based intelligent tutoring. With MT-Video-Bench, we extensively evaluate various state-of-the-art open-source and closed-source MLLMs, revealing their significant performance discrepancies and limitations in handling multi-turn video dialogues. The benchmark will be publicly available to foster future research.
2025
OAgents: An Empirical Study of Building Effective Agents
He Zhu | Tianrui Qin | King Zhu | Heyuan Huang | Yeyi Guan | Jinxiang Xia | Hanhao Li | Yi Yao | Ningning Wang | Pai Liu | Tianhao Peng | Xin Gui | Li Xiaowan | Yuhui Liu | Xiangru Tang | Jian Yang | Ge Zhang | Xitong Gao | Yuchen Eleanor Jiang | Changwang Zhang | Jun Wang | Jiaheng Liu | Wangchunshu Zhou
Findings of the Association for Computational Linguistics: EMNLP 2025
Recently, Agentic AI has become an increasingly popular field of research. However, we argue that current practices in agent research fall far short of standard, rigorous scientific research, which makes it hard to conduct apples-to-apples comparisons among and against existing methods. As a result, it remains unclear how different design choices in an agent framework affect its effectiveness, and measuring progress in agent research remains very hard. In this work, we conduct a systematic empirical study on the GAIA benchmark to investigate the impact of popular design choices within key agent components in a fair and rigorous way. To begin with, we find that the lack of a standard evaluation protocol makes previous works, even the open-sourced ones, not reproducible, and that the variance between different random runs is often non-negligible. Therefore, we first introduce a more robust evaluation protocol to make comparisons more stable. Our empirical study then reveals which components and designs, as well as which correlations between these designs, are key to building effective agents, and which are redundant despite seeming reasonable. With the insights gained from our empirical study, we build and open-source OAgents, a new foundation agent framework that achieves state-of-the-art performance among open-source projects, providing a good starting point and guidelines for building effective agents. More importantly, OAgents supports various design choices for agent components in a modularized way, facilitating future scientific research on Agentic AI.
2024
Soft Knowledge Prompt: Help External Knowledge Become a Better Teacher to Instruct LLM in Knowledge-based VQA
Qunbo Wang | Ruyi Ji | Tianhao Peng | Wenjun Wu | Zechao Li | Jing Liu
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) have achieved impressive performance on multi-modal tasks, which have received ever-increasing research attention. Recent research focuses on improving prediction performance and reliability (e.g., addressing the hallucination problem). Such methods often prepend relevant external knowledge to the input text as an extra prompt. However, they are affected by noise in the knowledge and by the context length limitation of the LLM. In our work, we focus on making better use of external knowledge and propose a method that actively extracts valuable information from the knowledge to produce a latent vector as a soft prompt, which is then fused with the image embedding to form a knowledge-enhanced context to instruct the LLM. The experimental results on knowledge-based VQA benchmarks show that the proposed method makes better use of external knowledge and helps the model achieve better performance.
Co-authors
- Jiaheng Liu 2
- Zhidong Gan 1
- Xitong Gao 1
- Yeyi Guan 1
- Xin Gui 1
- Haoxuan Hu 1
- Heyuan Huang 1
- Yibing Huang 1
- Ruyi Ji 1
- Yuchen Eleanor Jiang 1
- Hanhao Li 1
- Zechao Li 1
- Shihao Li 1
- Yonghong Lin 1
- Pai Liu 1
- Yuhui Liu 1
- Jing Liu (刘晶, 刘璟) 1
- Yaning Pan 1
- Zhiyu Pan 1
- An Ping 1
- Tianrui Qin 1
- Xiangru Tang 1
- Ningning Wang 1
- Jun Wang 1
- Qunbo Wang 1
- Zekun Moore Wang 1
- Yanghai Wang 1
- Yongqian Wen 1
- Wenjun Wu 1
- Jinxiang Xia 1
- Li Xiaowan 1
- Qianqian Xie 1
- Jian Yang 1
- Yi Yao 1
- Ge Zhang 1
- Changwang Zhang 1
- Guohui Zhang 1
- Yuanxing Zhang 1
- Wangchunshu Zhou 1
- He Zhu 1
- King Zhu 1