User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems

Jianling Wang, Yifan Liu, Yinghao Sun, Xuejian Ma, Yueqi Wang, He Ma, Zhengyang Su, Minmin Chen, Mingyan Gao, Onkar Dalal, Ed H. Chi, Lichan Hong, Ningren Han, Haokai Lu
Abstract
Exploration, the act of broadening user experiences beyond their established preferences, is challenging in large-scale recommendation systems due to feedback loops and limited signals on user exploration patterns. Large Language Models (LLMs) offer potential solutions by leveraging their world knowledge to recommend novel content outside these loops. A key challenge is aligning LLMs with user preferences while preserving their knowledge and reasoning. To enhance planning for new user interests using LLMs, this paper introduces a novel approach that combines hierarchical planning with LLM inference-time scaling, aiming to improve recommendation relevance without compromising novelty. We decouple novelty and user-alignment, training separate LLMs for each objective. We then scale up the novelty-focused LLM’s inference and select the best-of-n predictions using the user-aligned LLM. Live experiments demonstrate the method’s efficacy, showing significant gains in both user satisfaction (measured by watch activity and active user counts) and exploration diversity.
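To make the best-of-n procedure concrete, here is a minimal Python sketch of the decoupled setup the abstract describes: a novelty-focused generator samples n candidate interests, and a user-aligned model scores each candidate so the highest-scoring one is returned. The function names and interfaces below are illustrative assumptions, not the authors’ actual implementation.

```python
# Minimal best-of-n sketch of the decoupled setup described in the abstract.
# `generate_novel` and `score_alignment` are hypothetical stand-ins for the
# novelty-focused LLM (sampled) and the user-aligned LLM (used as a scorer).
import random
from typing import Callable, List


def best_of_n(
    user_history: str,
    generate_novel: Callable[[str], str],          # novelty-focused LLM, sampled n times
    score_alignment: Callable[[str, str], float],  # user-aligned LLM as a reward/scorer
    n: int = 8,
) -> str:
    """Sample n novel-interest candidates, return the one the
    user-aligned model scores highest for this user."""
    candidates: List[str] = [generate_novel(user_history) for _ in range(n)]
    return max(candidates, key=lambda c: score_alignment(user_history, c))


if __name__ == "__main__":
    # Toy stubs for illustration only; real models would replace these.
    topics = ["bouldering", "sourdough baking", "birdwatching", "chess openings"]
    pick = best_of_n(
        user_history="watched: cooking, hiking",
        generate_novel=lambda h: random.choice(topics),
        score_alignment=lambda h, c: float(len(set(h) & set(c))),  # toy overlap score
        n=4,
    )
    print(pick)
```

The appeal of this decoupling, as the abstract frames it, is that the generator can stay purely novelty-focused at sampling time, while user alignment is enforced only at selection time rather than being baked into a single model that must trade the two objectives off.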
Anthology ID: 2025.acl-industry.70
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Georg Rehm, Yunyao Li
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 996–1003
URL: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.acl-industry.70/
Cite (ACL):
Jianling Wang, Yifan Liu, Yinghao Sun, Xuejian Ma, Yueqi Wang, He Ma, Zhengyang Su, Minmin Chen, Mingyan Gao, Onkar Dalal, Ed H. Chi, Lichan Hong, Ningren Han, and Haokai Lu. 2025. User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 996–1003, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
User Feedback Alignment for LLM-powered Exploration in Large-scale Recommendation Systems (Wang et al., ACL 2025)
PDF: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.acl-industry.70.pdf