Yonghe Lu

Also published as: 永和


2025

pdf bib
FNSCC: Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering for Short Text
Zijian Zheng | Yonghe Lu | Jian Yin
Findings of the Association for Computational Linguistics: EMNLP 2025

Short texts pose significant challenges for clustering due to semantic sparsity, limited context, and fuzzy category boundaries. Although recent contrastive learning methods improve instance-level representation, they often overlook local semantic structure within the clustering head. Moreover, treating semantically similar neighbors as negatives impair cluster-level discrimination. To address these issues, we propose Fuzzy Neighborhood-Aware Self-Supervised Contrastive Clustering (FNSCC) framework. FNSCC incorporates neighborhood information at both the instance-level and cluster-level. At the instance-level, it excludes neighbors from the negative sample set to enhance inter-cluster separability. At the cluster-level, it introduces fuzzy neighborhood-aware weighting to refine soft assignment probabilities, encouraging alignment with semantically coherent clusters. Experiments on multiple benchmark short text datasets demonstrate that FNSCC consistently outperforms state-of-the-art models in accuracy and normalized mutual information. Our code is available at https://github.com/zjzone/FNSCC.

2023

pdf bib
融合Synonyms 词库的专利语义相似度计算研究(Patent Semantic Similarity Calculation by Fusing Synonyms Database)
Xinyu Tong (佟昕瑀) | Jialun Liao (廖佳伦) | Yonghe Lu (路永和)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics

“一直以来,专利相似度计算和比较等工作都由专利审查员人工进行并做出准确判断。然而,以人工方式分析和研判专利的原创性、实用性以及是否侵权等工作需要投入大量的人力物力资源且效率较低。基于此,本文将ALBERT预训练模型用于专利的文本表示,并通过引入Synonyms近义词库增强专利文本的语义表达能力,探索一种基于语义知识库和深度学习的专利文本表示模型与相似度计算方法。实验结果表明,加入Synonyms近义词库消歧后的专利文本相似性度量的实验准确率有一定的提升。”