Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks

Huajian Zhang; Yumo Xu; Laura Perez-Beltrachini

Abstract

We study existing approaches that leverage off-the-shelf Natural Language Inference (NLI) models for the evaluation of summary faithfulness and argue that these are sub-optimal due to the granularity level considered for premises and hypotheses. That is, the smallest content unit considered as a hypothesis is a sentence, and premises are made up of a fixed number of document sentences. We propose a novel approach, namely INFUSE, that uses a variable premise size and simplifies summary sentences into shorter hypotheses. Departing from previous studies, which focus on single short document summarisation, we analyse NLI-based faithfulness evaluation for diverse summarisation tasks. We introduce DiverSumm, a new benchmark comprising long-form summarisation (long documents and summaries) and diverse summarisation tasks (e.g., meeting and multi-document summarisation). In experiments, INFUSE obtains superior performance across the different summarisation tasks.

Anthology ID: 2024.eacl-long.102
Volume: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1701–1722
URL: https://preview.aclanthology.org/fix-sig-urls/2024.eacl-long.102/
Cite (ACL): Huajian Zhang, Yumo Xu, and Laura Perez-Beltrachini. 2024. Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1701–1722, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks (Zhang et al., EACL 2024)
PDF: https://preview.aclanthology.org/fix-sig-urls/2024.eacl-long.102.pdf
Video: https://preview.aclanthology.org/fix-sig-urls/2024.eacl-long.102.mp4
