LLM-MemCluster: Empowering Large Language Models with Dynamic Memory for Text Clustering

Yuanjie Zhu, Liangwei Yang, Ke Xu, Weizhi Zhang, Zihe Song, Jindong Wang, Philip S. Yu


Abstract
Large Language Models (LLMs) are reshaping unsupervised learning by offering an unprecedented ability to perform text clustering based on their deep semantic understanding. However, their direct application is fundamentally limited by a lack of stateful memory for iterative refinement and the difficulty of managing cluster granularity. As a result, existing methods often rely on complex pipelines with external modules, sacrificing a truly end-to-end approach. We introduce LLM-MemCluster, a novel framework that reconceptualizes clustering as a fully LLM-native task. It leverages a Dynamic Memory to instill state awareness and a Dual-Prompt Strategy to enable the model to reason about and determine the number of clusters. Evaluated on several benchmark datasets, our tuning-free framework significantly and consistently outperforms strong baselines. LLM-MemCluster presents an effective, interpretable, and truly end-to-end paradigm for LLM-based text clustering.
Anthology ID:
2026.knowfm-1.7
Volume:
Proceedings of the 4th Workshop on Towards Knowledgeable Foundation Models (KnowFM 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Canyu Chen, Yuji Zhang, Zoey Sha Li, Zihan Wang, Qineng Wang, Jinyan Su, Priyanka Kargupta, Sara Vera Marjanović, Jeff Z. Pan, Mohit Bansal, Isabelle Augenstein, Jiawei Han, Heng Ji, Manling Li
Venues:
KnowFM | WS
Publisher:
Association for Computational Linguistics
Pages:
90–104
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.knowfm-1.7/
Cite (ACL):
Yuanjie Zhu, Liangwei Yang, Ke Xu, Weizhi Zhang, Zihe Song, Jindong Wang, and Philip S. Yu. 2026. LLM-MemCluster: Empowering Large Language Models with Dynamic Memory for Text Clustering. In Proceedings of the 4th Workshop on Towards Knowledgeable Foundation Models (KnowFM 2026), pages 90–104, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
LLM-MemCluster: Empowering Large Language Models with Dynamic Memory for Text Clustering (Zhu et al., KnowFM 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.knowfm-1.7.pdf