Improving Topic Quality by Promoting Named Entities in Topic Modeling

Katsiaryna Krasnashchok, Salim Jouili


Abstract
News-related content has been extensively studied in both topic modeling research and named entity recognition. However, the expressive power of named entities and their potential for improving the quality of discovered topics have not received much attention. In this paper, we use named entities as domain-specific terms for news-centric content and present a new weighting model for Latent Dirichlet Allocation. Our experimental results indicate that involving more named entities in topic descriptors positively influences the overall quality of topics, improving their interpretability, specificity, and diversity.
Anthology ID:
P18-2040
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
247–253
URL:
https://aclanthology.org/P18-2040
DOI:
10.18653/v1/P18-2040
Cite (ACL):
Katsiaryna Krasnashchok and Salim Jouili. 2018. Improving Topic Quality by Promoting Named Entities in Topic Modeling. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 247–253, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Improving Topic Quality by Promoting Named Entities in Topic Modeling (Krasnashchok & Jouili, ACL 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/P18-2040.pdf
Poster:
P18-2040.Poster.pdf