PromptStream: Self-Supervised News Story Discovery Using Topic-Aware Article Representations

Arezoo Hatefi, Anton Eklund, Mona Forsman


Abstract
Given the importance of identifying and monitoring news stories within the continuous flow of news articles, this paper presents PromptStream, a novel method for unsupervised news story discovery. In order to identify coherent and comprehensive stories across the stream, it is crucial to create article representations that incorporate as much topic-related information from the articles as possible. PromptStream constructs these article embeddings using cloze-style prompting. These representations continually adjust to the evolving context of the news stream through self-supervised learning, employing a contrastive loss and a memory of the most confident article-story assignments from the most recent days. Extensive experiments with real news datasets highlight the notable performance of our model, establishing a new state of the art. Additionally, we delve into selected news stories to reveal how the model’s structuring of the article stream aligns with story progression.
Anthology ID:
2024.lrec-main.1157
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
13222–13232
URL:
https://aclanthology.org/2024.lrec-main.1157
Cite (ACL):
Arezoo Hatefi, Anton Eklund, and Mona Forsman. 2024. PromptStream: Self-Supervised News Story Discovery Using Topic-Aware Article Representations. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 13222–13232, Torino, Italia. ELRA and ICCL.
Cite (Informal):
PromptStream: Self-Supervised News Story Discovery Using Topic-Aware Article Representations (Hatefi et al., LREC-COLING 2024)
PDF:
https://preview.aclanthology.org/naacl24-info/2024.lrec-main.1157.pdf