Lubos Steskal
2026
Personalizing News Headlines with Retrieval-Augmented Generation
Jiajing Wan | Samia Touileb | Lubos Steskal | Lilja Øvrelid
Proceedings of the Second Workshop on Customizable NLP: Progress and Challenges in Customizing NLP for a Domain, Application, Group, or Individual (CustomNLP4U)
We focus on personalized news headline generation, extending the generation context to incorporate users' news reading histories. In particular, we study a RAG-LLM-based system that customizes news headlines using these user histories. Our experiments show that our approach not only produces better headlines for specific users, but also brings the generated headlines closer to the original headlines. We experiment with different retrievers and analyze the generated outputs through systematic comparisons with both original and rewritten headlines. These analyses provide insights into the role of retrieval and personalization in headline generation, highlighting how user history contributes to meaningful improvement while the generated headlines remain aligned with the originals.
2024
EDEN: A Dataset for Event Detection in Norwegian News
Samia Touileb | Jeanett Murstad | Petter Mæhlum | Lubos Steskal | Lilja Charlotte Storset | Huiling You | Lilja Øvrelid
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
We present EDEN, the first Norwegian dataset annotated with event information at the sentence level, adapting the widely used ACE event schema to Norwegian. The paper describes the manual annotation of Norwegian text as well as transcribed speech in the news domain, together with inter-annotator agreement and discussions of relevant dataset statistics. We also present preliminary modeling results using a graph-based event parser. The resulting dataset will be freely available for download and use.