Mingfu Liang
2026
ReasonRec: A Reasoning-Augmented Multimodal Agent for Unified Recommendation
Yihua Zhang | Mingfu Liang | Jiyan Yang | Rong Jin | Wen-Yen Chen | Yiping Han | Huayu Li | Buyun Zhang | Liang Luo | Luke Simon | Sijia Liu | Tianlong Chen | Xi Liu
Findings of the Association for Computational Linguistics: ACL 2026
Recent multimodal recommenders excel at feature fusion but remain opaque and inefficient decision-makers, lacking explicit reasoning and awareness of their own uncertainty. To address this, we introduce ReasonRec, a reasoning-augmented multimodal agent structured around a three-stage explicit reasoning pipeline: Observe, via a pretrained Vision-Language Model (VLM) encoder; Deliberate, by formulating recommendation as chain-of-thought (CoT) reasoning tasks and explicitly quantifying prediction uncertainty through an evidence-horizon-aware curriculum; and Act, through dynamic delegation of uncertain or challenging queries to lightweight classical recommendation models. Specifically, we propose a reasoning-aware visual instruction tuning strategy that systematically transforms diverse recommendation tasks into unified CoT prompts, enabling the VLM to explicitly articulate intermediate decision steps. Additionally, our evidence-horizon curriculum progressively increases reasoning complexity to better handle cold-start and long-tail user scenarios, significantly boosting model generalization. Furthermore, the uncertainty-guided delegation mechanism enables the agent to assess its own confidence and allocate computational resources strategically, optimizing both recommendation accuracy and inference efficiency. Comprehensive experiments on four standard recommendation tasks (sequential recommendation, direct recommendation, CTR prediction, and explanation generation) across five real-world datasets demonstrate that ReasonRec achieves over 30% relative improvement in key ranking metrics (e.g., HR@5, NDCG@5) compared to state-of-the-art multimodal recommenders. Crucially, ReasonRec substantially reduces inference latency by dynamically delegating up to 35% of queries to efficient sub-models without compromising accuracy. Extensive ablation studies further confirm that each proposed reasoning and planning mechanism individually contributes substantially to ReasonRec’s overall effectiveness. Collectively, our results illustrate a clear pathway towards interpretable, adaptive, and efficient multimodal recommendation through explicit reasoning and agentic design.
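The Act stage described above, delegating uncertain queries to a lightweight model, can be illustrated with a minimal sketch. This is not the ReasonRec implementation: the abstract does not specify how uncertainty is measured, so the normalized-entropy score, the `threshold` value, and the names `route` and `fallback` are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over candidate-item scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy scaled to [0, 1]; near 1 means close to uniform."""
    n = len(probs)
    if n <= 1:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def route(
    agent_scores: Sequence[float],
    fallback: Callable[[], int],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Tuple[str, int]:
    """Keep the agent's top item when it is confident, else delegate.

    `fallback` stands in for a lightweight classical recommender and is
    only called for queries the agent is uncertain about.
    """
    probs = softmax(agent_scores)
    if normalized_entropy(probs) > threshold:
        return "fallback", fallback()
    best = max(range(len(probs)), key=probs.__getitem__)
    return "agent", best
```

With a sharply peaked score vector such as `[10.0, 0.0, 0.0]` the agent's own top item is kept, while a flat vector such as `[1.0, 1.0, 1.0]` is handed to the fallback. In this sketch the delegation rate is controlled entirely by `threshold`.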
2025
The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit
Huixue Zhou | Hengrui Gu | Zaifu Zhan | Xi Liu | Kaixiong Zhou | Yongkang Xiao | Mingfu Liang | Srinivas Prasad Govindan | Piyush Chawla | Jiyan Yang | Xiangfei Meng | Huayu Li | Buyun Zhang | Liang Luo | Wen-Yen Chen | Yiping Han | Bo Long | Rui Zhang | Tianlong Chen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The deployment of Large Language Models (LLMs) in recommender systems for Click-Through Rate (CTR) prediction requires a careful balance between computational efficiency and predictive accuracy. This paper introduces OptiRAG-Rec, a comprehensive framework that integrates Retrieval-Augmented Generation (RAG) with a novel multi-head early exit architecture to address both challenges. By leveraging Graph Convolutional Networks (GCNs) as efficient retrieval mechanisms, the framework significantly reduces data retrieval times while maintaining high model performance. Additionally, the multi-head early exit strategy dynamically terminates inference based on real-time predictive confidence assessments, enhancing responsiveness without sacrificing accuracy. Experimental results demonstrate that OptiRAG-Rec reduces computation time while preserving the precision required for reliable recommendations, establishing a new benchmark for efficient and accurate LLM deployment in recommendation.
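The confidence-based early exit idea in this abstract can be sketched as follows. This is a toy illustration, not OptiRAG-Rec itself: the abstract does not give the exit criterion, so the rule `max(p, 1 - p) >= threshold`, the function `early_exit_predict`, and the toy `double` and `sigmoid_head` components are assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Layer = Callable[[List[float]], List[float]]  # hidden state -> hidden state
Head = Callable[[List[float]], float]         # hidden state -> click probability


def early_exit_predict(
    x: List[float],
    layers: Sequence[Layer],
    heads: Sequence[Head],
    threshold: float = 0.9,  # hypothetical confidence cut-off
) -> Tuple[float, int]:
    """Run layers in order and stop once an exit head is confident.

    Returns the click probability and the number of layers evaluated.
    """
    if len(layers) != len(heads):
        raise ValueError("each layer needs exactly one exit head")
    hidden = x
    prob = 0.5  # uninformative default if there are no layers
    depth = 0
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        hidden = layer(hidden)
        prob = head(hidden)
        # Confidence of a binary prediction: distance from the 0.5 boundary.
        if max(prob, 1.0 - prob) >= threshold:
            break
    return prob, depth


# Toy components, used only to exercise the control flow.
def double(h: List[float]) -> List[float]:
    return [2.0 * v for v in h]


def sigmoid_head(h: List[float]) -> float:
    return 1.0 / (1.0 + math.exp(-sum(h)))
```

Calling `early_exit_predict([1.0], [double] * 3, [sigmoid_head] * 3)` stops after the second of three layers, because the second head's probability already exceeds the 0.9 threshold. Lowering `threshold` trades accuracy for fewer evaluated layers, which is the efficiency versus accuracy trade-off named in the title.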
AssoCiAm: A Benchmark for Evaluating Association Thinking while Circumventing Ambiguity
Yifan Liu | Wenkuan Zhao | Shanshan Zhong | Jinghui Qin | Mingfu Liang | Zhongzhan Huang | Wushao Wen
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Recent advancements in multimodal large language models (MLLMs) have garnered significant attention, offering a promising pathway toward artificial general intelligence (AGI). Among the essential capabilities required for AGI, creativity has emerged as a critical trait for MLLMs, with association serving as its foundation. Association reflects a model’s ability to think creatively, making it vital to evaluate and understand. While several frameworks have been proposed to assess associative ability, they often overlook the inherent ambiguity in association tasks, which arises from the divergent nature of associations and undermines the reliability of evaluations. To address this issue, we decompose ambiguity into two types, internal ambiguity and external ambiguity, and introduce AssoCiAm, a benchmark designed to evaluate associative ability while circumventing ambiguity through a hybrid computational method. We then conduct extensive experiments on MLLMs, revealing a strong positive correlation between cognition and association. Additionally, we observe that the presence of ambiguity in the evaluation process makes MLLMs’ behavior more random-like. Finally, we validate the effectiveness of our method in ensuring more accurate and reliable evaluations. The data and code are available on the project page.
Co-authors
- Wen-Yen Chen 2
- Tianlong Chen 2
- Yiping Han 2
- Huayu Li 2
- Xi Liu 2
- Liang Luo 2
- Jiyan Yang 2
- Buyun Zhang 2
- Piyush Chawla 1
- Srinivas Prasad Govindan 1
- Hengrui Gu 1
- Zhongzhan Huang 1
- Rong Jin 1
- Sijia Liu 1
- Yifan Liu 1
- Bo Long 1
- Xiangfei Meng 1
- Jinghui Qin 1
- Luke Simon 1
- Wushao Wen 1
- Yongkang Xiao 1
- Zaifu Zhan 1
- Yihua Zhang 1
- Rui Zhang 1
- Wenkuan Zhao 1
- Shanshan Zhong 1
- Huixue Zhou 1
- Kaixiong Zhou 1