ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation

Gibson Nkhata, Uttamasha Anjally Oyshi, Quan Mai, Susan Gauch


Abstract
Personalized Retrieval-Augmented Generation (RAG) relies on accurately selecting user-relevant documents. In practice, existing RAG approaches often incur high retrieval costs and overlook the fact that collaborative signals from similar users can enhance personalized generation for the current user. We propose ClusterRAG, a cluster-based collaborative filtering framework for personalized Retrieval-Augmented Generation. ClusterRAG represents users through their profile documents, organizes users into semantically coherent clusters using density-based clustering, and performs retrieval at both the cluster and document levels via cluster-level similarity and fine-grained ranking. Extensive experiments on the LaMP benchmark demonstrate that jointly leveraging the target user’s profile and the profiles of the most similar users consistently yields the best performance across diverse tasks. Further analysis shows that ClusterRAG integrates seamlessly with different dense retrievers and rankers, and remains effective when paired with both fine-tuned and zero-shot language models.
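The pipeline outlined in the abstract (user representation, density-based clustering, then cluster-level and document-level retrieval) might be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the mean-of-embeddings user vector, the hand-written DBSCAN over cosine distance, the `eps`/`min_pts` values, the two-dimensional embeddings, and every function and user name are hypothetical choices made only for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over cosine distance; label -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if 1.0 - cosine(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels

def retrieve(users, target, query, eps=0.1, min_pts=2, top_users=1, top_docs=3):
    """Pool the target's documents with those of similar same-cluster users."""
    names = list(users)
    # Assumption: a user is represented by the mean of their document embeddings.
    user_vecs = [mean_vector([emb for _, emb in users[n]]) for n in names]
    labels = dbscan(user_vecs, eps, min_pts)
    t = names.index(target)
    # Cluster level: peers share the target's cluster, ranked by user similarity.
    peers = [i for i in range(len(names))
             if i != t and labels[t] != -1 and labels[i] == labels[t]]
    peers.sort(key=lambda i: cosine(user_vecs[i], user_vecs[t]), reverse=True)
    pool = list(users[target])
    for i in peers[:top_users]:
        pool.extend(users[names[i]])
    # Document level: rank the pooled documents against the query embedding.
    pool.sort(key=lambda doc: cosine(doc[1], query), reverse=True)
    return [doc_id for doc_id, _ in pool[:top_docs]]

# Toy profiles: user -> list of (document id, embedding). All values are made up.
users = {
    "alice": [("a1", [1.0, 0.0]), ("a2", [0.9, 0.1])],
    "bob":   [("b1", [0.95, 0.05]), ("b2", [0.8, 0.2])],
    "carol": [("c1", [0.0, 1.0]), ("c2", [0.1, 0.9])],
    "dave":  [("d1", [0.05, 0.95])],
}
selected = retrieve(users, "alice", query=[1.0, 0.0])
print(selected)
```

In this toy setup the retrieved context for one user can mix that user's own documents with documents from a same-cluster neighbor, which is the collaborative signal the abstract refers to. A real system would replace the toy vectors with embeddings from a dense retriever and would likely use a library clustering implementation.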
Anthology ID:
2026.acl-long.940
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
20523–20539
URL:
https://preview.aclanthology.org/check-for-anonymous-pdfs/2026.acl-long.940/
Cite (ACL):
Gibson Nkhata, Uttamasha Anjally Oyshi, Quan Mai, and Susan Gauch. 2026. ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 20523–20539, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation (Nkhata et al., ACL 2026)
PDF:
https://preview.aclanthology.org/check-for-anonymous-pdfs/2026.acl-long.940.pdf
Checklist:
 2026.acl-long.940.checklist.pdf