Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model

Qi Jia, Siyu Ren, Yizhu Liu, Kenny Zhu


Abstract
Despite tremendous improvements in natural language generation, summarization models still suffer from the unfaithfulness issue. Previous work evaluates faithfulness either using models trained on other tasks or on in-domain synthetic data, or by prompting a large model such as ChatGPT. This paper proposes to do zero-shot faithfulness evaluation simply with a moderately-sized foundation language model. We introduce a new metric, FFLM, which combines probability changes based on the intuition that prefixing a piece of text that is consistent with the output will increase the probability of predicting the output. Experiments show that FFLM performs competitively with or even outperforms ChatGPT on both inconsistency detection and faithfulness rating with 24x fewer parameters. FFLM also achieves improvements over other strong baselines.
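The core intuition can be sketched in code: score a summary by how much conditioning on the source document raises its probability under a foundation language model. The sketch below is an assumption-laden illustration, not the paper's exact metric; FFLM combines several probability changes, whereas this uses a single delta of average token log-probabilities, and the model name ("gpt2") and helper functions are placeholders.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: any moderately-sized causal foundation LM; "gpt2" is a placeholder.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def summary_logprob(prefix: str, summary: str) -> float:
    """Average token log-probability of `summary`, optionally conditioned on `prefix`."""
    summary_ids = tokenizer(summary, return_tensors="pt").input_ids
    if prefix:
        prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids
        input_ids = torch.cat([prefix_ids, summary_ids], dim=1)
        n_prefix = prefix_ids.shape[1]
    else:
        input_ids = summary_ids
        n_prefix = 0
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability assigned to each next token given everything before it.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    targets = input_ids[:, 1:]
    token_lp = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    # Keep only the positions that predict summary tokens.
    summary_lp = token_lp[:, max(n_prefix - 1, 0):]
    return summary_lp.mean().item()

def faithfulness_score(document: str, summary: str) -> float:
    """Higher when prefixing the document raises the summary's probability."""
    return summary_logprob(document + "\n", summary) - summary_logprob("", summary)

A faithful summary should become much more probable once the document is prefixed, while a hallucinated one gains little, which is the probability-change signal the paper builds its metric around.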
Anthology ID:
2023.emnlp-main.679
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
11017–11031
URL:
https://aclanthology.org/2023.emnlp-main.679
DOI:
10.18653/v1/2023.emnlp-main.679
Cite (ACL):
Qi Jia, Siyu Ren, Yizhu Liu, and Kenny Zhu. 2023. Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 11017–11031, Singapore. Association for Computational Linguistics.
Cite (Informal):
Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model (Jia et al., EMNLP 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.emnlp-main.679.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-5/2023.emnlp-main.679.mp4