Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation Tuning
Jizhi Zhang, Chongming Gao, Wentao Shi, Xin Chen, Jingang Wang, Xunliang Cai, Fuli Feng
Abstract
Most recommender systems focus on short-term objectives such as click-through rate, often at the expense of long-term user satisfaction. This can lead to echo chambers, where users are repeatedly exposed to redundant content. While recent efforts integrate Large Language Models (LLMs) into recommendation, they typically inherit this short-sighted focus. In this work, we highlight unpaired feedback—implicit signals such as continued engagement (positive) or silent disengagement (negative) that lack explicit contrastive labels—as a key challenge for long-term recommendation. Effectively learning from such feedback is crucial for improving LLM-based recommenders in dynamic user environments. To this end, we propose ULRec (Unpaired Feedback for Long-Term LLM-based Recommendation Tuning), a simple framework that fine-tunes LLMs using both positive and negative unpaired feedback. ULRec leverages the KTO algorithm to incorporate these signals without requiring paired supervision. Despite its simplicity, ULRec consistently improves long-term recommendation performance, demonstrating the value of modeling unpaired user feedback.
- Anthology ID:
- 2025.findings-emnlp.1332
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2025
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 24507–24521
- URL:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1332/
- DOI:
- 10.18653/v1/2025.findings-emnlp.1332
- Cite (ACL):
- Jizhi Zhang, Chongming Gao, Wentao Shi, Xin Chen, Jingang Wang, Xunliang Cai, and Fuli Feng. 2025. Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation Tuning. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 24507–24521, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- Leveraging Unpaired Feedback for Long-Term LLM-based Recommendation Tuning (Zhang et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1332.pdf
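The abstract's key point is that KTO scores each example independently as desirable or undesirable, so no paired (chosen, rejected) supervision is needed. A minimal sketch of a KTO-style per-example loss in the spirit of Kahneman-Tversky Optimization (Ethayarajh et al., 2024) is shown below; the function names and default hyperparameters are illustrative assumptions, not the authors' released implementation, and the reference-point KL term is passed in as a precomputed scalar rather than estimated from a batch.

```python
# Hedged sketch of a KTO-style loss for unpaired feedback.
# Names (kto_loss, z_ref, lam_d, lam_u) are hypothetical, not ULRec's actual code.
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(logratio: float, desirable: bool, z_ref: float = 0.0,
             beta: float = 0.1, lam_d: float = 1.0, lam_u: float = 1.0) -> float:
    """Per-example KTO-style loss.

    logratio  -- log pi_theta(y|x) - log pi_ref(y|x) for a single example
    desirable -- True for positive feedback (e.g. continued engagement),
                 False for negative feedback (e.g. silent disengagement)
    z_ref     -- reference point (a KL-based baseline in the KTO paper)
    """
    if desirable:
        # Push the policy to prefer desirable outputs over the reference point.
        return lam_d * (1.0 - sigmoid(beta * (logratio - z_ref)))
    # Push the policy to prefer the reference point over undesirable outputs.
    return lam_u * (1.0 - sigmoid(beta * (z_ref - logratio)))

# Each example is scored on its own -- no contrastive pair is required.
pos = kto_loss(2.0, desirable=True)   # liked item the model already prefers: small loss
neg = kto_loss(2.0, desirable=False)  # disliked item the model prefers: larger loss
```

With the same log-ratio, the undesirable example incurs the larger loss, which is the asymmetry that lets unpaired positive and negative signals both shape the policy.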