Abstract
Topic coherence is increasingly used to evaluate topic models and to filter topics for end-user applications. Topic coherence measures how well topic words relate to each other, but offers little insight into how useful the topics are for describing the documents. In this paper, we explore the topic intrusion task — the task of guessing an outlier topic given a document and a few topics — and propose a method to automate it. We improve substantially on the state of the art, demonstrating the task's viability as an alternative method for topic model evaluation.

- Anthology ID: D18-1098
- Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month: October–November
- Year: 2018
- Address: Brussels, Belgium
- Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue: EMNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 844–849
- URL: https://aclanthology.org/D18-1098
- DOI: 10.18653/v1/D18-1098
- Cite (ACL): Shraey Bhatia, Jey Han Lau, and Timothy Baldwin. 2018. Topic Intrusion for Automatic Topic Model Evaluation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 844–849, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal): Topic Intrusion for Automatic Topic Model Evaluation (Bhatia et al., EMNLP 2018)
- PDF: https://preview.aclanthology.org/naacl24-info/D18-1098.pdf
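To make the task concrete: in topic intrusion, an annotator (or an automated system) sees a document and a handful of topics, one of which is an "intruder" that is unrelated to the document, and must pick out that outlier. A minimal sketch of one natural automated guessing rule — picking the shown topic least associated with the document — is below. This is an illustrative assumption, not the authors' method; the function name and the topic-probability data are hypothetical.

```python
# Illustrative sketch of automated topic intrusion (NOT the paper's method):
# guess the intruder as the shown topic with the lowest probability under
# the document's topic distribution.

def guess_intruder(shown_topics, doc_topic_probs):
    """Return the topic id among `shown_topics` with the lowest
    probability for this document (defaulting to 0.0 if unseen)."""
    return min(shown_topics, key=lambda t: doc_topic_probs.get(t, 0.0))

# Hypothetical example: a document dominated by topics 0-2, with
# low-probability topic 7 inserted as the intruder.
doc_topic_probs = {0: 0.45, 1: 0.30, 2: 0.20, 7: 0.01}
print(guess_intruder([0, 1, 2, 7], doc_topic_probs))  # 7
```

An automated evaluation then compares such guesses against the known intruders over many documents; a model whose topics make the intruder easy to spot is, by this measure, producing topics that describe documents well.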