Leveraging Meta Information in Short Text Aggregation

He Zhao, Lan Du, Guanfeng Liu, Wray Buntine


Abstract
Short texts such as tweets often contain insufficient word co-occurrence information for training conventional topic models. To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated meta information. Our model can generate more interpretable topics as well as document clusters. We develop an effective Gibbs sampling algorithm favoured by the fully local conjugacy in the model. Extensive experiments demonstrate that our model achieves better performance in terms of document clustering and topic coherence.
Anthology ID:
P19-1396
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
4042–4049
URL:
https://aclanthology.org/P19-1396
DOI:
10.18653/v1/P19-1396
Cite (ACL):
He Zhao, Lan Du, Guanfeng Liu, and Wray Buntine. 2019. Leveraging Meta Information in Short Text Aggregation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4042–4049, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Leveraging Meta Information in Short Text Aggregation (Zhao et al., ACL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/P19-1396.pdf