@inproceedings{you-etal-2025-event,
    title = "Event-based evaluation of abstractive news summarization",
    author = "You, Huiling  and
      Touileb, Samia  and
      {\O}vrelid, Lilja  and
      Velldal, Erik",
    editor = "Arviv, Ofir  and
      Clinciu, Miruna  and
      Dhole, Kaustubh  and
      Dror, Rotem  and
      Gehrmann, Sebastian  and
      Habba, Eliya  and
      Itzhak, Itay  and
      Mille, Simon  and
      Perlitz, Yotam  and
      Santus, Enrico  and
      Sedoc, Jo{\~a}o  and
      Shmueli Scheuer, Michal  and
      Stanovsky, Gabriel  and
      Tafjord, Oyvind",
    booktitle = "Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM{\texttwosuperior})",
    month = jul,
    year = "2025",
    address = "Vienna, Austria and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.gem-1.43/",
    pages = "504--510",
    ISBN = "979-8-89176-261-9",
    abstract = "An abstractive summary of a news article contains its most important information in a condensed version. The evaluation of automatically generated summaries by generative language models relies heavily on human-authored summaries as gold references, by calculating overlapping units or similarity scores. News articles report events, and ideally so should the summaries. In this work, we propose to evaluate the quality of abstractive summaries by calculating overlapping events between generated summaries, reference summaries, and the original news articles. We experiment on a richly annotated Norwegian dataset comprising both event annotations and summaries authored by expert human annotators. Our approach provides more insight into the event information contained in the summaries."
}