Abstract
Short texts such as tweets often contain insufficient word co-occurrence information for training conventional topic models. To deal with the insufficiency, we propose a generative model that aggregates short texts into clusters by leveraging the associated meta information. Our model can generate more interpretable topics as well as document clusters. We develop an effective Gibbs sampling algorithm favoured by the fully local conjugacy in the model. Extensive experiments demonstrate that our model achieves better performance in terms of document clustering and topic coherence.
- Anthology ID:
- P19-1396
- Volume:
- Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Anna Korhonen, David Traum, Lluís Màrquez
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4042–4049
- URL:
- https://aclanthology.org/P19-1396
- DOI:
- 10.18653/v1/P19-1396
- Cite (ACL):
- He Zhao, Lan Du, Guanfeng Liu, and Wray Buntine. 2019. Leveraging Meta Information in Short Text Aggregation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4042–4049, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Leveraging Meta Information in Short Text Aggregation (Zhao et al., ACL 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/P19-1396.pdf