Yushan Tan


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2023

pdf bib
Improved Training of Deep Text Clustering
Zonghao Yang | Wenpeng Hu | Yushan Tan | Zhunchen Luo
Findings of the Association for Computational Linguistics: EMNLP 2023

The classical deep clustering optimization methods basically leverage information such as clustering centers, mutual information, and distance metrics to construct implicit generalized labels to establish information feedback (weak supervision) and thus optimize the deep model. However, the resulting generalized labels have different degrees of errors in the whole clustering process due to the limitation of clustering accuracy, which greatly interferes with the clustering process. To this end, this paper proposes a general deep clustering optimization method from the perspective of empirical risk minimization, using the correlation relationship between the samples. Experiments on two classical deep clustering methods demonstrate the necessity and effectiveness of the method. Code is available at https://github.com/yangzonghao1024/DCGLU.