Yangyang Li
2025
Tunable LLM-based Proactive Recommendation Agent
Mingze Wang | Chongming Gao | Wenjie Wang | Yangyang Li | Fuli Feng
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recommender systems are indispensable on various digital platforms. However, traditional methods often reinforce existing user interests, which leads to echo chambers and limits diversity. Proactive Recommendation Systems (PRS) aim to address this issue by cultivating users' latent interests through multi-step recommendations. Despite advancements, challenges persist, particularly in optimizing long-term rewards and adapting to real-time user feedback. In this study, we propose an LLM-based Actor-Critic Agent framework to enhance PRS. This framework uses an LLM-based agent to adjust recommendations in real time based on feedback and employs agent-tuning methods to optimize long-term rewards using three proposed reward functions. Extensive experiments show that this framework significantly outperforms existing methods, both in optimizing long-term rewards and in adapting dynamically to user feedback.
2024
Dual-Phase Accelerated Prompt Optimization
Muchen Yang | Moxin Li | Yongle Li | Zijun Chen | Chongming Gao | Junqi Zhang | Yangyang Li | Fuli Feng
Findings of the Association for Computational Linguistics: EMNLP 2024
Gradient-free prompt optimization methods have made significant strides in enhancing the performance of closed-source Large Language Models (LLMs) across a wide range of tasks. However, existing approaches underestimate the importance of high-quality prompt initialization and of identifying effective optimization directions, and therefore need a substantial number of optimization steps to reach satisfactory performance. In this light, we aim to accelerate the prompt optimization process to tackle the challenge of a low convergence rate. We propose a dual-phase approach that starts by generating high-quality initial prompts, adopting a well-designed meta-instruction to delve into task-specific information, and then iteratively optimizes the prompts at the sentence level, leveraging previous tuning experience to expand prompt candidates and accept effective ones. Extensive experiments on eight datasets demonstrate the effectiveness of our proposed method, which achieves a consistent accuracy gain over baselines with fewer than five optimization steps.
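
As a rough illustration of the two phases described in this abstract, the sketch below picks the best of several initial prompts and then greedily refines it one sentence at a time, remembering rejected edits. It is not the authors' implementation: the names `optimize_prompt`, `toy_score`, and `toy_propose` are hypothetical, and the keyword-counting scorer and string-appending proposer are toy stand-ins for what would presumably be LLM calls and a validation-set accuracy measure.

```python
from typing import Callable, List, Tuple

Prompt = List[str]                 # a prompt is a list of sentences
History = List[Tuple[str, str, bool]]  # (old sentence, new sentence, accepted?)


def optimize_prompt(
    initial_prompts: List[Prompt],
    score: Callable[[Prompt], float],
    propose: Callable[[str, History], List[str]],
    max_steps: int = 5,
) -> Tuple[Prompt, float]:
    """Greedy sentence-level search (an assumed stand-in, not the paper's algorithm)."""
    # Phase 1: start from the highest-scoring initial prompt.
    best = max(initial_prompts, key=score)
    best_score = score(best)
    history: History = []

    # Phase 2: try rewriting each sentence; accept a rewrite only if it helps.
    for _ in range(max_steps):
        improved = False
        for i in range(len(best)):
            for new_sentence in propose(best[i], history):
                candidate = best[:i] + [new_sentence] + best[i + 1:]
                candidate_score = score(candidate)
                accepted = candidate_score > best_score
                history.append((best[i], new_sentence, accepted))
                if accepted:
                    best, best_score = candidate, candidate_score
                    improved = True
                    break  # move on to the next sentence
        if not improved:
            break  # no sentence could be improved; stop early
    return best, best_score


def toy_score(prompt: Prompt) -> float:
    # Hypothetical scorer: counts which target phrases appear in the prompt.
    text = " ".join(prompt).lower()
    return float(sum(kw in text for kw in ("step by step", "concise", "example")))


def toy_propose(sentence: str, history: History) -> List[str]:
    # Hypothetical proposer: appends fixed hints, skipping previously rejected edits.
    rejected = {new for _, new, ok in history if not ok}
    variants = [sentence + " Think step by step.", sentence + " Be concise."]
    return [v for v in variants if v not in rejected]


if __name__ == "__main__":
    seeds = [["Solve the problem."], ["Solve the problem.", "Give one example."]]
    prompt, value = optimize_prompt(seeds, toy_score, toy_propose)
    print(value, prompt)
```

In this sketch the edit history only filters out rewrites that were already rejected; a fuller system would presumably feed that experience back into candidate generation.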