Dhruv Mehra


2023

EntSUMv2: Dataset, Models and Evaluation for More Abstractive Entity-Centric Summarization
Dhruv Mehra | Lingjue Xie | Ella Hofmann-Coyle | Mayank Kulkarni | Daniel Preotiuc-Pietro
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Entity-centric summarization is a form of controllable summarization that aims to generate a summary for a specific entity given a document. Concise summaries are valuable in various real-life applications, as they enable users to quickly grasp a document's main points as they relate to an entity of interest. This paper presents ENTSUMV2, a more abstractive version of the original entity-centric ENTSUM summarization dataset. In ENTSUMV2, the annotated summaries are intentionally made shorter so that they are more specific and more useful to downstream users. We conduct extensive experiments on this dataset using multiple abstractive summarization approaches that employ supervised fine-tuning or large-scale instruction tuning. Additionally, we perform a comprehensive human evaluation that incorporates metrics for measuring crucial facets. These metrics provide a more fine-grained interpretation of the current state-of-the-art systems and highlight areas for future improvement.