Yifei Hu
2026
OneRec-Think: In-Text Reasoning for Generative Recommendation
Zhanyu Liu | Shiyao Wang | Xingmei Wang | Rongzhou Zhang | Jiaxin Deng | Honghui Bao | Jinghao Zhang | Wuchao Li | PengFei Zheng | Xiangyu Wu | Yifei Hu | Qigen Hu | Xinchen Luo | Lejian Ren | Zhang Zixing | Qianqian Wang | Kuo Cai | Yunfan Wu | Hongtao Cheng | Zexuan Cheng | Lu Ren | Huanjie Wang | Yi Su | Ruiming Tang | Kun Gai | Guorui Zhou
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The powerful generative capacity of Large Language Models (LLMs) has instigated a paradigm shift in recommendation. However, existing generative models (e.g., OneRec) operate as implicit predictors, critically lacking the capacity for explicit and controllable reasoning—a key advantage of LLMs. To bridge this gap, we propose OneRec-Think, a unified framework that seamlessly integrates dialogue, reasoning, and personalized recommendation. OneRec-Think incorporates: (1) Itemic Alignment: cross-modal Item-Textual Alignment for semantic grounding; (2) Reasoning Activation: Reasoning Scaffolding to activate LLM reasoning within the recommendation context; and (3) Reasoning Enhancement, where we design a recommendation-specific reward function that accounts for the multi-validity nature of user preferences. Experiments across public benchmarks show state-of-the-art performance. Moreover, our proposed "Think-Ahead" architecture enables effective industrial deployment, achieving a 0.159% gain in APP Stay Time and validating the practical efficacy of the model’s explicit reasoning capability.
2025
AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving
Kangan Qian | Sicong Jiang | Yang Zhong | Ziang Luo | Zilin Huang | Tianze Zhu | Kun Jiang | Mengmeng Yang | Zheng Fu | Jinyu Miao | Yining Shi | He Zhe Lim | Li Liu | Tianbao Zhou | Hongyi Wang | Huang Yu | Yifei Hu | Guang Li | Guang Chen | Hao Ye | Lijun Sun | Diange Yang
Findings of the Association for Computational Linguistics: EMNLP 2025
Vision-Language Models (VLMs) show promise for autonomous driving, yet their struggle with hallucinations, inefficient reasoning, and limited real-world validation hinders accurate perception and robust step-by-step reasoning. To overcome this, we introduce AgentThink, a pioneering unified framework that, for the first time, integrates Chain-of-Thought (CoT) reasoning with dynamic, agent-style tool invocation for autonomous driving tasks. AgentThink’s core innovations include: (i) Structured Data Generation, by establishing an autonomous driving tool library to automatically construct structured, self-verified reasoning data explicitly incorporating tool usage for diverse driving scenarios; (ii) A Two-stage Training Pipeline, employing Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO) to equip VLMs with the capability for autonomous tool invocation; and (iii) Agent-style Tool-Usage Evaluation, introducing a novel multi-tool assessment protocol to rigorously evaluate the model’s tool invocation and utilization. Experiments on the DriveLMM-o1 benchmark demonstrate AgentThink significantly boosts overall reasoning scores by 53.91% and enhances answer accuracy by 33.54%, while markedly improving reasoning quality and consistency. Furthermore, ablation studies and robust zero-shot/few-shot generalization experiments across various benchmarks underscore its powerful capabilities. These findings highlight a promising trajectory for developing trustworthy and tool-aware autonomous driving models.
Co-authors
- Honghui Bao 1
- Kuo Cai 1
- Guang Chen 1
- Hongtao Cheng 1
- Zexuan Cheng 1
- Jiaxin Deng 1
- Zheng Fu 1
- Kun Gai 1
- Qigen Hu 1
- Zilin Huang 1
- Kun Jiang 1
- Sicong Jiang 1
- Guang Li 1
- Wuchao Li 1
- He Zhe Lim 1
- Li Liu 1
- Zhanyu Liu 1
- Xinchen Luo 1
- Ziang Luo 1
- Jinyu Miao 1
- Kangan Qian 1
- Lejian Ren 1
- Lu Ren 1
- Yining Shi 1
- Yi Su 1
- Lijun Sun 1
- Ruiming Tang 1
- Hongyi Wang 1
- Huanjie Wang 1
- Qianqian Wang 1
- Shiyao Wang 1
- Xingmei Wang 1
- Xiangyu Wu 1
- Yunfan Wu 1
- Diange Yang 1
- Mengmeng Yang 1
- Hao Ye 1
- Huang Yu 1
- Jinghao Zhang 1
- Rongzhou Zhang 1
- PengFei Zheng 1
- Yang Zhong 1
- Guorui Zhou 1
- Tianbao Zhou 1
- Tianze Zhu 1
- Zhang Zixing 1