FactAlign: Fact-Level Hallucination Detection and Classification Through Knowledge Graph Alignment

Mohamed Rashad, Ahmed Zahran, Abanoub Amin, Amr Abdelaal, Mohamed Altantawy


Abstract
This paper proposes a novel black-box approach to fact-level hallucination detection and classification that recasts the problem as a knowledge graph alignment task. This formulation also allows us to classify each detected hallucination as either intrinsic or extrinsic. We first survey the field of hallucination detection and review several related approaches. We then introduce the proposed FactAlign approach and describe how it is used to classify hallucinations by type. We evaluate the proposed method against state-of-the-art methods on the hallucination detection task using the WikiBio GPT-3 hallucination dataset, and on the hallucination type classification task using the XSum hallucination annotations dataset. The experimental results show that our method achieves an F1 score of 0.889 for hallucination detection and 0.825 for hallucination type classification, without any further training, fine-tuning, or generation of multiple samples of the LLM response.
Anthology ID:
2024.trustnlp-1.8
Volume:
Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kai-Wei Chang, Anaelia Ovalle, Jieyu Zhao, Yang Trista Cao, Ninareh Mehrabi, Aram Galstyan, Jwala Dhamala, Anoop Kumar, Rahul Gupta
Venues:
TrustNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
79–84
URL:
https://aclanthology.org/2024.trustnlp-1.8
Cite (ACL):
Mohamed Rashad, Ahmed Zahran, Abanoub Amin, Amr Abdelaal, and Mohamed Altantawy. 2024. FactAlign: Fact-Level Hallucination Detection and Classification Through Knowledge Graph Alignment. In Proceedings of the 4th Workshop on Trustworthy Natural Language Processing (TrustNLP 2024), pages 79–84, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
FactAlign: Fact-Level Hallucination Detection and Classification Through Knowledge Graph Alignment (Rashad et al., TrustNLP-WS 2024)
PDF:
https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.trustnlp-1.8.pdf
Supplementary material:
 2024.trustnlp-1.8.SupplementaryMaterial.zip