Joemon M. Jose
2025
SOLAR: Serendipity Optimized Language Model Aligned for Recommendation
Zichen Yuan | Lifan Sun | Yucen Zhuang | Yue Wang | Xinyuan Song | Tianqi Xu | Siyuan Li | Junchen Fu | Youhua Li | Sirui Hong | Jiaqi Chen | Joemon M. Jose | Yongxin Ni
Findings of the Association for Computational Linguistics: EMNLP 2025
Recently, Large Language Models (LLMs) have shown strong potential in recommendation tasks due to their broad world knowledge and reasoning capabilities. However, applying them to serendipity-oriented recommendation remains challenging, mainly because of the domain gap LLMs face in modeling personalized user behavior and the scarcity of labeled serendipitous interactions. In this paper, we introduce **SOLAR** (**S**erendipity-**O**ptimized **L**anguage model **A**ligned for **R**ecommendation), a two-stage framework that addresses both challenges. To alleviate label scarcity, we adopt a weak supervision strategy: a sequential ID-based recommender generates candidate items, which an LLM acting as a preference judge then reranks to produce serendipity-aware pseudo-labels. To bridge the domain gap, we propose a domain-adaptive instruction-tuning method (SUN) that aligns LLMs with recommendation tasks. Experiments on three real-world datasets show that **SOLAR** consistently improves both accuracy and serendipity over strong baselines, demonstrating its effectiveness in enabling more diverse, user-centric recommendations. Code and dataset are released at [https://github.com/SOLAR2025ARR/SOLAR](https://github.com/SOLAR2025ARR/SOLAR).
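As a rough illustration of the weak-supervision step described in the abstract, the Python sketch below shows one way an LLM-as-judge reranking could turn recommender-generated candidates into serendipity-aware pseudo-labels. The `llm_judge` callable, the prompt wording, and the index-parsing logic are assumptions made for illustration, not the paper's actual implementation.

```python
from typing import Callable, List


def build_serendipity_pseudo_labels(
    user_history: List[str],
    candidate_items: List[str],
    llm_judge: Callable[[str], str],
    top_k: int = 5,
) -> List[str]:
    """Weak-supervision sketch: an ID-based sequential recommender has already
    produced `candidate_items` for this user; an LLM acting as a preference
    judge reranks them for serendipity, and the top-k become pseudo-labels.

    `llm_judge` is a placeholder for whatever chat/completion API is used;
    it takes a prompt string and returns the model's text response.
    """
    prompt = (
        "A user recently interacted with these items:\n"
        + "\n".join(f"- {item}" for item in user_history[-10:])
        + "\n\nCandidate items:\n"
        + "\n".join(f"{idx}. {item}" for idx, item in enumerate(candidate_items))
        + "\n\nRank the candidates that are both relevant to the user and "
        "pleasantly unexpected (serendipitous). Answer with the candidate "
        "indices, best first, separated by commas."
    )
    response = llm_judge(prompt)

    # Parse the index list returned by the judge; ignore anything malformed.
    ranked: List[str] = []
    for token in response.replace("\n", ",").split(","):
        token = token.strip().rstrip(".")
        if token.isdigit() and int(token) < len(candidate_items):
            item = candidate_items[int(token)]
            if item not in ranked:
                ranked.append(item)
    return ranked[:top_k]  # serendipity-aware pseudo-labels for fine-tuning
```

In this reading, the pseudo-labels would then serve as supervision targets when instruction-tuning the LLM on the recommendation task.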
2018
Batch IS NOT Heavy: Learning Word Representations From All Samples
Xin Xin | Fajie Yuan | Xiangnan He | Joemon M. Jose
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Stochastic Gradient Descent (SGD) with negative sampling is the most prevalent approach to learning word representations. However, sampling methods are known to be biased, especially when the sampling distribution deviates from the true data distribution. In addition, SGD suffers from dramatic fluctuations due to its one-sample learning scheme. In this work, we propose AllVec, which uses batch gradient learning to generate word representations from all training samples. Remarkably, the time complexity of AllVec remains at the same level as SGD, being determined by the number of positive samples rather than all samples. We evaluate AllVec on several benchmark tasks. Experiments show that AllVec outperforms sampling-based SGD methods with comparable efficiency, especially for small training corpora.
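For intuition on how an all-sample batch update can stay at the cost level of the positive samples, here is a minimal NumPy sketch of the standard Gram-matrix decomposition: the uniform penalty over all (word, context) pairs collapses into d × d matrices computed once per update, so the quadratic-in-vocabulary term never materializes. The simplified squared loss, the uniform weight `alpha`, and the function names are assumptions for illustration, not AllVec's exact objective.

```python
import numpy as np


def allsample_gradients(U, V, pos_pairs, pos_vals, alpha=0.01):
    """Full-batch gradients of a weighted squared loss over ALL (word, context)
    pairs, using the Gram-matrix trick so the cost is driven by the number of
    positive pairs plus an O(d^2) term, not by the |vocab|^2 negative pairs.

    Simplified loss (for this sketch):
        sum_{(i,j) in P} (s_ij - u_i . v_j)^2          # observed co-occurrences
      + alpha * sum_{all i, all j} (u_i . v_j)^2       # uniform weight on every
                                                       # pair, positives included
    U: (n_words, d) word vectors; V: (n_ctx, d) context vectors.
    pos_pairs: int array of (i, j) indices; pos_vals: target scores s_ij.
    """
    grad_U = np.zeros_like(U)
    grad_V = np.zeros_like(V)

    # All-pair part: sum_{i,j} (u_i . v_j)^2 = sum_i u_i^T (V^T V) u_i.
    # The d x d Gram matrices are computed once, independent of vocab^2.
    GV = V.T @ V                      # (d, d)
    GU = U.T @ U                      # (d, d)
    grad_U += 2.0 * alpha * U @ GV
    grad_V += 2.0 * alpha * V @ GU

    # Positive part: only the observed pairs are touched.
    i_idx, j_idx = pos_pairs[:, 0], pos_pairs[:, 1]
    pred = np.einsum("nd,nd->n", U[i_idx], V[j_idx])
    resid = pred - pos_vals                            # (n_pos,)
    np.add.at(grad_U, i_idx, 2.0 * resid[:, None] * V[j_idx])
    np.add.at(grad_V, j_idx, 2.0 * resid[:, None] * U[i_idx])
    return grad_U, grad_V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_words, n_ctx, d = 1000, 1000, 32
    U = 0.1 * rng.standard_normal((n_words, d))
    V = 0.1 * rng.standard_normal((n_ctx, d))
    pos = rng.integers(0, n_words, size=(5000, 2))
    vals = rng.random(5000)
    gU, gV = allsample_gradients(U, V, pos, vals)
    U -= 0.05 * gU                    # one full-batch update step
    V -= 0.05 * gV
```

The key design point is that the expensive all-pairs term only ever enters through `V.T @ V` and `U.T @ U`, which is what keeps a full-batch update at roughly the same per-epoch cost as sampling-based SGD.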