RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation

Sashuai Zhou, Weinan Gan, Qijiong Liu, Ke Lei, Jieming Zhu, Hai Huang, Yan Xia, Ruiming Tang, Zhenhua Dong, Zhou Zhao


Abstract
Recent advances in LLM-based recommendation have shown promise, yet their cross-domain generalization is hindered by a fundamental mismatch between language-centric pretraining and the recommendation task. Existing methods rely on language-level knowledge and thus fail to capture dynamic, item-level user interests across domains. To bridge this gap, we propose RecBase, a domain-agnostic foundation model pretrained with a recommendation-oriented objective. RecBase leverages a large-scale, heterogeneous, cross-domain corpus with unified textual representations and feature mappings to enhance cross-domain generalization. To further align item semantics across domains, we introduce a unified item tokenizer that encodes items into hierarchical concept identifiers, enabling structured representation and efficient vocabulary sharing. The model is trained with an autoregressive objective to capture complex item-level sequential patterns. On eight real-world datasets, our 1.5B-parameter model matches or surpasses LLM baselines of up to 7B parameters on zero-shot and cross-domain recommendation tasks.
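
To make the abstract's two core ideas concrete, the sketch below illustrates (1) quantizing an item embedding into a coarse-to-fine sequence of concept identifiers, here via a residual nearest-neighbor quantizer, and (2) the standard autoregressive next-token loss over the resulting ID sequences. This is a minimal illustration under assumed names, depths, and dimensions, not the paper's implementation; the authors' tokenizer and training details may differ.

```python
# Hedged sketch: hierarchical item tokenization + autoregressive objective.
# NUM_LEVELS, CODEBOOK_SIZE, EMB_DIM, and the residual-quantization scheme
# are illustrative assumptions, not values taken from the paper.
import torch
import torch.nn as nn

NUM_LEVELS = 3        # depth of the concept hierarchy (assumed)
CODEBOOK_SIZE = 256   # concepts per level (assumed)
EMB_DIM = 64          # item embedding dimension (assumed)

class HierarchicalItemTokenizer(nn.Module):
    """Maps an item embedding to NUM_LEVELS concept IDs, coarse to fine,
    by repeated nearest-neighbor lookup on the quantization residual."""
    def __init__(self):
        super().__init__()
        self.codebooks = nn.ParameterList(
            nn.Parameter(torch.randn(CODEBOOK_SIZE, EMB_DIM))
            for _ in range(NUM_LEVELS)
        )

    def forward(self, item_emb: torch.Tensor) -> torch.Tensor:
        residual, ids = item_emb, []
        for codebook in self.codebooks:
            dists = torch.cdist(residual, codebook)   # (batch, CODEBOOK_SIZE)
            idx = dists.argmin(dim=-1)                # nearest concept per item
            ids.append(idx)
            residual = residual - codebook[idx]       # quantize what remains
        return torch.stack(ids, dim=-1)               # (batch, NUM_LEVELS)

def next_token_loss(logits: torch.Tensor, token_ids: torch.Tensor) -> torch.Tensor:
    """Autoregressive objective over flattened concept-ID sequences:
    predict each token from its prefix, exactly as in language modeling."""
    # logits: (batch, seq_len, vocab); token_ids: (batch, seq_len)
    return nn.functional.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        token_ids[:, 1:].reshape(-1),
    )
```

Under this reading, interaction histories become sequences of concept IDs drawn from a vocabulary shared across domains, which is what lets a single autoregressive model transfer to unseen catalogs without per-domain retraining.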
Anthology ID:
2025.emnlp-main.786
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
15598–15610
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.786/
Cite (ACL):
Sashuai Zhou, Weinan Gan, Qijiong Liu, Ke Lei, Jieming Zhu, Hai Huang, Yan Xia, Ruiming Tang, Zhenhua Dong, and Zhou Zhao. 2025. RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 15598–15610, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation (Zhou et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.786.pdf
Checklist:
2025.emnlp-main.786.checklist.pdf