Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization

Meng Cao, Yue Dong, Jackie Cheung


Abstract
State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text. Although hallucinated content is typically assumed to be incorrect, we find that much of it is actually consistent with world knowledge; we call such content factual hallucinations. Including these factual hallucinations in a summary can be beneficial because they provide useful background information. In this work, we propose a novel detection approach that separates factual from non-factual hallucinations of entities. Our method is based on an entity's prior and posterior probabilities according to pre-trained and fine-tuned masked language models, respectively. Empirical results show that our method substantially outperforms two baselines in both accuracy and F1 score and correlates strongly with human judgments on factuality classification tasks. Furthermore, we use our method as a reward signal to train a summarization system with an offline reinforcement learning (RL) algorithm, which significantly improves the factuality of generated summaries while maintaining their level of abstractiveness.
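To make the prior/posterior idea in the abstract concrete, below is a minimal sketch of how the two probabilities for a summary entity could be estimated with an off-the-shelf masked language model. It is illustrative only: the model name (bert-base-cased), the masked_prob helper, and the example source/summary strings are assumptions made for this sketch; the paper itself uses a pre-trained masked LM for the prior, a model fine-tuned on the summarization data and conditioned on the source document for the posterior, and learns a classifier over the two scores.

# Sketch: prior vs. posterior probability of a (masked) summary entity.
# Assumptions: single-wordpiece entity, off-the-shelf BERT checkpoint used for both scores.
from transformers import AutoTokenizer, AutoModelForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-cased")
model.eval()

def masked_prob(context: str, entity: str) -> float:
    """Probability of `entity` at the [MASK] position in `context`.
    Assumes `entity` is a single wordpiece in the model vocabulary."""
    inputs = tokenizer(context, return_tensors="pt", truncation=True)
    mask_pos = (inputs.input_ids[0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits          # [1, seq_len, vocab]
    probs = logits[0, mask_pos].softmax(dim=-1)  # distribution over the vocabulary at [MASK]
    entity_id = tokenizer.convert_tokens_to_ids(entity)
    return probs[0, entity_id].item()

# Hypothetical source document and summary; "Dublin" is the hallucinated entity under test.
source = "The annual meeting took place in the Irish capital this year."
summary = "The 2022 meeting was held in [MASK]."

prior = masked_prob(summary, "Dublin")                     # no source: world-knowledge signal only
posterior = masked_prob(source + " " + summary, "Dublin")  # source prepended: document-grounded signal

# Illustrative reading (not the paper's trained classifier): an entity absent from the source
# but with a high prior is more likely a factual hallucination; low prior and low posterior
# point to a non-factual hallucination.
print(f"prior={prior:.4f}  posterior={posterior:.4f}")

In practice the entity may span several subword tokens and the source may exceed the model's input length, which is why the paper relies on a conditional, fine-tuned model for the posterior; the sketch only shows where the two probabilities come from.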
Anthology ID:
2022.acl-long.236
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3340–3354
URL:
https://aclanthology.org/2022.acl-long.236
DOI:
10.18653/v1/2022.acl-long.236
Cite (ACL):
Meng Cao, Yue Dong, and Jackie Cheung. 2022. Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3340–3354, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Hallucinated but Factual! Inspecting the Factuality of Hallucinations in Abstractive Summarization (Cao et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.236.pdf
Software:
2022.acl-long.236.software.zip
Code:
mcao516/entfa