Chunjiang Zhu

2025

Supervised Neural Topic Modeling with Label Alignment
Ruihao Chen | Hegang Chen | Yuyin Lu | Yanghui Rao | Chunjiang Zhu
Transactions of the Association for Computational Linguistics, Volume 13

Neural topic modeling is a scalable automated technique for text data mining. In various downstream tasks of topic modeling, it is preferred that the discovered topics align well with labels. However, due to the lack of guidance from labels, unsupervised neural topic models are less powerful in this situation. Existing supervised neural topic models often adopt a label-free prior to generate the latent document-topic distributions, use them to predict the labels, and thus achieve label-topic alignment only indirectly. Such a mechanism faces the following issues: 1) the label-free prior leads to topics that blend the latent patterns of multiple labels; and 2) one is unable to intuitively identify the explicit relationships between labels and the discovered topics. To tackle these problems, we develop a novel supervised neural topic model that utilizes a chain-structured graphical model with a label-conditioned prior. Soft indicators are introduced to explicitly construct the label-topic relationships. To obtain well-organized label-topic relationships, we formalize an entropy-regularized optimal transport problem on the embedding space and model the relationships as the transport plan. Moreover, our proposed method can be flexibly integrated with most existing unsupervised neural topic models. Experimental results on multiple datasets demonstrate that our model can greatly enhance the alignment between labels and topics while maintaining good topic quality.

2024

Multi-Granularity History and Entity Similarity Learning for Temporal Knowledge Graph Reasoning
Shi Mingcong | Chunjiang Zhu | Detian Zhang | Shiting Wen | Li Qing
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Unsupervised Hierarchical Topic Modeling via Anchor Word Clustering and Path Guidance
Jiyuan Liu | Hegang Chen | Chunjiang Zhu | Yanghui Rao
Findings of the Association for Computational Linguistics: EMNLP 2024

Current hierarchical topic models tend to capture the relationship between words and topics, often ignoring the role of anchor words that guide text generation. For the first time, we detect and add anchor words to the text generation process in an unsupervised way. Firstly, we adopt a clustering algorithm to adaptively detect anchor words that are highly consistent with every topic, which forms the path of topic → anchor word. Secondly, we add the causal path of anchor word → word to the popular Variational Auto-Encoder (VAE) framework by implicitly using word co-occurrence graphs. Thirdly, we develop the causal path of topic+anchor word → higher-layer topic, which aids the expression of topic concepts with anchor words to capture a more semantically tight hierarchical topic structure. Finally, we enhance the model’s representation of the anchor words through novel contrastive learning. After jointly training the aforementioned constraint objectives, we can produce more coherent and diverse topics with a better hierarchical structure. Extensive experiments on three datasets show that our model outperforms state-of-the-art methods.
