Continuous Relational Diffusion Driven Topic Model with Multi-grained Text for Microblog

Chenhao Wu, Ruifang He, Chang Liu, Bo Wang


Abstract
A topic model is a statistical model that uses unsupervised learning to mine hidden topics in document collections. The data sparsity and colloquialism of social texts make it difficult to mine topics accurately. Traditional methods assume that two parties in a social network have only a binary (0/1) relationship, but real-life relationships are more complicated: they change continuously and vary in degree of intimacy. This paper proposes a continuous relational diffusion driven topic model (CRTM) with multi-grained text for microblog, which represents the relationship state continuously and recovers the context and structural information lost by previous representation methods. Multi-grained text representation learning further distinguishes the impact of formal and informal expressions on topics and alleviates the colloquialism problem. Specifically, starting from the original social network, information diffusion is used to obtain a reconstructed social network with continuous relationship status. A graph convolution model then learns node embeddings over the new social network. Finally, neural variational inference is applied to generate topics according to the continuous relationships. We validate CRTM on three real datasets, and the experimental results show the effectiveness of the scheme.
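The reconstruction and embedding steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which diffusion technique CRTM uses, so the sketch assumes a truncated personalized-PageRank diffusion as a stand-in. The function names (`diffuse`, `propagate`), the restart probability `alpha`, the toy graph, and the projection matrix are all hypothetical and are not taken from the paper.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def row_normalize(adj):
    """Turn a 0/1 adjacency matrix into a random-walk transition matrix."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def diffuse(adj, alpha=0.15, steps=10):
    """Assumed diffusion: truncated personalized PageRank.

    Returns S = sum_{k=0..steps} alpha * (1 - alpha)**k * T**k, where T is
    the row-normalized adjacency. S holds continuous relationship weights,
    including weights between users that have no direct 0/1 link.
    """
    n = len(adj)
    trans = row_normalize(adj)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    result = [[0.0] * n for _ in range(n)]
    for k in range(steps + 1):
        coeff = alpha * (1 - alpha) ** k
        for i in range(n):
            for j in range(n):
                result[i][j] += coeff * power[i][j]
        power = matmul(power, trans)  # advance T**k to T**(k+1)
    return result


def propagate(weights, features, proj):
    """One graph-convolution-style layer: ReLU(weights @ features @ proj)."""
    hidden = matmul(matmul(weights, features), proj)
    return [[max(0.0, v) for v in row] for row in hidden]


# Toy social network (hypothetical): a path of four users, 0-1-2-3.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
soft = diffuse(adj)

# One-hot node features and an arbitrary 4x2 projection (both hypothetical).
features = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
proj = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6], [0.2, 0.1]]
hidden = propagate(soft, features, proj)
```

In this sketch, users 0 and 2 share no direct edge in `adj`, yet `soft[0][2]` is positive because the diffusion spreads weight along multi-hop paths. This is the sense in which a reconstructed network can carry relationships of differing strength rather than 0/1 states. The neural variational inference step that generates topics is not shown.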
Anthology ID:
2024.lrec-main.345
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
3897–3906
URL:
https://aclanthology.org/2024.lrec-main.345
Cite (ACL):
Chenhao Wu, Ruifang He, Chang Liu, and Bo Wang. 2024. Continuous Relational Diffusion Driven Topic Model with Multi-grained Text for Microblog. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3897–3906, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Continuous Relational Diffusion Driven Topic Model with Multi-grained Text for Microblog (Wu et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2024.lrec-main.345.pdf
Optional supplementary material:
 2024.lrec-main.345.OptionalSupplementaryMaterial.rar