Abstract
Recently emerged phrase-level topic models can provide topics of phrases, which are easy for humans to read. But these models lack the ability to capture the correlation structure among the numerous discovered topics. We propose a novel topic model, PhraseCTM, and a two-stage method to find correlated topics at the phrase level. In the first stage, we train PhraseCTM, which models the generation of words and phrases simultaneously by linking phrases and their component words within Markov Random Fields when they are semantically coherent. In the second stage, we generate the correlation of topics from PhraseCTM. We evaluate our method with a quantitative experiment and a human study, showing that correlated topic modeling on phrases is a good and practical way to interpret the underlying themes of a corpus.
- Anthology ID:
- P18-2083
- Volume:
- Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Editors:
- Iryna Gurevych, Yusuke Miyao
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 521–526
- URL:
- https://aclanthology.org/P18-2083
- DOI:
- 10.18653/v1/P18-2083
- Cite (ACL):
- Weijing Huang. 2018. PhraseCTM: Correlated Topic Modeling on Phrases within Markov Random Fields. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 521–526, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- PhraseCTM: Correlated Topic Modeling on Phrases within Markov Random Fields (Huang, ACL 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/P18-2083.pdf