Abstract
We present our novel, hyperparameter-free topic modelling algorithm, Community Topic. Our algorithm is based on mining communities from term co-occurrence networks. We empirically evaluate and compare Community Topic with Latent Dirichlet Allocation and the recently developed top2vec algorithm. We find that Community Topic runs faster than the competitors and produces topics that achieve higher coherence scores. Community Topic can discover coherent topics at various scales. The network representation used by Community Topic results in a natural relationship between topics and a topic hierarchy. This allows sub- and super-topics to be found on demand. These features make Community Topic the ideal tool for downstream applications such as applied research and conversational agents.
- Anthology ID: 2022.coling-1.81
- Volume: Proceedings of the 29th International Conference on Computational Linguistics
- Month: October
- Year: 2022
- Address: Gyeongju, Republic of Korea
- Editors: Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na
- Venue: COLING
- Publisher: International Committee on Computational Linguistics
- Pages: 971–983
- URL: https://aclanthology.org/2022.coling-1.81
- Cite (ACL): Eric Austin, Osmar R. Zaïane, and Christine Largeron. 2022. Community Topic: Topic Model Inference by Consecutive Word Community Discovery. In Proceedings of the 29th International Conference on Computational Linguistics, pages 971–983, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
- Cite (Informal): Community Topic: Topic Model Inference by Consecutive Word Community Discovery (Austin et al., COLING 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-1/2022.coling-1.81.pdf