Abstract
Detecting changes within an unfolding event in real time from news articles or social media makes it possible to react promptly to serious issues in public safety, public health or natural disasters. In this study, we use on-line Latent Dirichlet Allocation (LDA) to model shifts in topics, and apply on-line change point detection (CPD) algorithms to detect when significant changes happen. We describe an on-line Bayesian change point detection algorithm that we use to detect topic changes from on-line LDA output. Extensive experiments on social media data and news articles show the benefits of on-line LDA versus standard LDA, and of on-line change point detection compared to off-line algorithms. This yields F-scores up to 52% on the detection of significant real-life changes from these document streams.
- Anthology ID:
- C18-1212
- Volume:
- Proceedings of the 27th International Conference on Computational Linguistics
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2505–2515
- URL:
- https://aclanthology.org/C18-1212
- Cite (ACL):
- Yunli Wang and Cyril Goutte. 2018. Real-time Change Point Detection using On-line Topic Models. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2505–2515, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- Real-time Change Point Detection using On-line Topic Models (Wang & Goutte, COLING 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/C18-1212.pdf