RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation
Sashuai Zhou, Weinan Gan, Qijiong Liu, Ke Lei, Jieming Zhu, Hai Huang, Yan Xia, Ruiming Tang, Zhenhua Dong, Zhou Zhao
Abstract
Recent advances in LLM-based recommendation have shown promise, yet their cross-domain generalization is hindered by a fundamental mismatch between language-centric pretraining and the recommendation task. Existing methods, relying on language-level knowledge, fail to capture dynamic, item-level user interests across domains. To bridge this gap, we propose RecBase, a domain-agnostic foundational model pretrained with a recommendation-oriented objective. RecBase leverages a large-scale, heterogeneous, cross-domain corpus with unified textual representations and feature mappings to enhance cross-domain generalization. To further align item semantics across domains, we introduce a unified item tokenizer that encodes items into hierarchical concept identifiers, enabling structured representation and efficient vocabulary sharing. The model is trained using an autoregressive objective to capture complex item-level sequential patterns. On eight real-world datasets, our 1.5B-parameter model matches or surpasses the performance of LLM baselines up to 7B parameters in zero-shot and cross-domain recommendation tasks.- Anthology ID:
- 2025.emnlp-main.786
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 15598–15610
- Language:
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.786/
- DOI:
- Cite (ACL):
- Sashuai Zhou, Weinan Gan, Qijiong Liu, Ke Lei, Jieming Zhu, Hai Huang, Yan Xia, Ruiming Tang, Zhenhua Dong, and Zhou Zhao. 2025. RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 15598–15610, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation (Zhou et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.786.pdf