@inproceedings{lee-etal-2024-unisumeval,
    title = "{U}ni{S}um{E}val: Towards Unified, Fine-grained, Multi-dimensional Summarization Evaluation for {LLM}s",
    author = "Lee, Yuho  and
      Yun, Taewon  and
      Cai, Jason  and
      Su, Hang  and
      Song, Hwanjun",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-emnlp.227/",
    doi = "10.18653/v1/2024.findings-emnlp.227",
    pages = "3941--3960",
    abstract = "Existing benchmarks for summarization quality evaluation often lack diverse input scenarios, focus on narrowly defined dimensions (e.g., faithfulness), and struggle with subjective and coarse-grained annotation schemes. To address these shortcomings, we create UniSumEval benchmark, which extends the range of input context (e.g., domain, length) and provides fine-grained, multi-dimensional annotations. We use AI assistance in data creation, identifying potentially hallucinogenic input texts, and also helping human annotators reduce the difficulty of fine-grained annotation tasks. With UniSumEval, we benchmark nine latest language models as summarizers, offering insights into their performance across varying input contexts and evaluation dimensions. Furthermore, we conduct a thorough comparison of SOTA automated summary evaluators. Our benchmark data will be available at https://github.com/DISL-Lab/UniSumEval-v1.0."
}

Markdown (Informal)
[UniSumEval: Towards Unified, Fine-grained, Multi-dimensional Summarization Evaluation for LLMs](https://aclanthology.org/2024.findings-emnlp.227/) (Lee et al., Findings 2024)
ACL
Yuho Lee, Taewon Yun, Jason Cai, Hang Su, and Hwanjun Song. 2024. UniSumEval: Towards Unified, Fine-grained, Multi-dimensional Summarization Evaluation for LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 3941–3960, Miami, Florida, USA. Association for Computational Linguistics.