FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text

Zijian Zheng, Yonghe Lu, Jian Yin


Abstract
Short texts pose significant challenges for clustering due to semantic sparsity, limited context, and fuzzy category boundaries. Although recent contrastive learning methods improve instance-level representation, they often overlook local semantic structure within the clustering head. Moreover, treating semantically similar neighbors as negatives impairs cluster-level discrimination. To address these issues, we propose the Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering (FNSCC) framework. FNSCC incorporates neighborhood information at both the instance level and the cluster level. At the instance level, it excludes neighbors from the negative sample set to enhance inter-cluster separability. At the cluster level, it introduces fuzzy neighborhood-aware weighting to refine soft assignment probabilities, encouraging alignment with semantically coherent clusters. Experiments on multiple benchmark short text datasets demonstrate that FNSCC consistently outperforms state-of-the-art models in accuracy and normalized mutual information. Our code is available at https://github.com/zjzone/FNSCC.
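The instance-level idea in the abstract (dropping semantic neighbors from the negative set) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `neighbor_masked_info_nce`, the use of cosine k-nearest neighbors to define the neighborhood, and the parameters `k` and `temperature` are all assumptions made for illustration. The loss is a standard InfoNCE-style objective over two augmented views, with each anchor's neighbors masked out of the denominator.

```python
import numpy as np

def neighbor_masked_info_nce(z_a, z_b, k=2, temperature=0.5):
    """InfoNCE-style loss that drops each anchor's k nearest neighbors
    from its negative set (hypothetical helper, assumes 1 <= k < n)."""
    # L2-normalise both views so dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    n = z_a.shape[0]

    # Cross-view similarities; the diagonal holds the positive pairs.
    sim = (z_a @ z_b.T) / temperature

    # Assumed neighborhood rule: top-k cosine neighbors within view A.
    self_sim = z_a @ z_a.T
    np.fill_diagonal(self_sim, -np.inf)  # a sample is not its own neighbor
    nn_idx = np.argsort(-self_sim, axis=1)[:, :k]

    # keep[i, j] is False when j is a neighbor of i and must not be a negative.
    keep = np.ones((n, n), dtype=bool)
    np.put_along_axis(keep, nn_idx, False, axis=1)

    # Masked log-sum-exp over the positive and the remaining negatives.
    logits = np.where(keep, sim, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))

    per_sample = -(np.diag(sim) - log_denominator)
    return per_sample.mean(), keep
```

Because masking only removes terms from the denominator, a larger neighborhood can never increase this loss for fixed embeddings; in practice `k` would need tuning so that true negatives are not discarded along with the neighbors.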
Anthology ID:
2025.findings-emnlp.154
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2831–2846
URL:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.154/
DOI:
10.18653/v1/2025.findings-emnlp.154
Cite (ACL):
Zijian Zheng, Yonghe Lu, and Jian Yin. 2025. FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 2831–2846, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text (Zheng et al., Findings 2025)
PDF:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.154.pdf
Checklist:
 2025.findings-emnlp.154.checklist.pdf