Abstract
Topic models jointly learn topics and document-level topic distributions. Extrinsic evaluation of topic models tends to focus exclusively on topic-level evaluation, e.g. by assessing the coherence of topics. We demonstrate that there can be large discrepancies between topic- and document-level model quality, and that basing model evaluation on topic-level analysis can be highly misleading. We propose a method for automatically predicting topic model quality based on analysis of document-level topic allocations, and provide empirical evidence for its robustness.

- Anthology ID: K17-1022
- Volume: Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
- Month: August
- Year: 2017
- Address: Vancouver, Canada
- Editors: Roger Levy, Lucia Specia
- Venue: CoNLL
- SIG: SIGNLL
- Publisher: Association for Computational Linguistics
- Pages: 206–215
- URL: https://aclanthology.org/K17-1022
- DOI: 10.18653/v1/K17-1022
- Cite (ACL): Shraey Bhatia, Jey Han Lau, and Timothy Baldwin. 2017. An Automatic Approach for Document-level Topic Model Evaluation. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 206–215, Vancouver, Canada. Association for Computational Linguistics.
- Cite (Informal): An Automatic Approach for Document-level Topic Model Evaluation (Bhatia et al., CoNLL 2017)
- PDF: https://preview.aclanthology.org/nschneid-patch-3/K17-1022.pdf