NootNoot At SemEval-2024 Task 6: Hallucinations and Related Observable Overgeneration Mistakes Detection

Sankalp Bahad, Yash Bhaskar, Parameswari Krishnamurthy


Abstract
Semantic hallucinations in neural language generation systems pose a significant challenge to the reliability and accuracy of natural language processing applications. Current neural models often produce fluent but incorrect outputs, undermining the usefulness of generated text. In this study, we address the task of detecting semantic hallucinations through the SHROOM (Semantic Hallucinations Real Or Mistakes) dataset, encompassing data from diverse NLG tasks such as definition modeling, machine translation, and paraphrase generation. We investigate three methodologies: fine-tuning on labelled training data, fine-tuning on labelled validation data, and a zero-shot approach using the Mixtral 8x7b instruct model. Our results demonstrate the effectiveness of these methodologies in identifying semantic hallucinations, with the zero-shot approach showing competitive performance without additional training. Our findings highlight the importance of robust detection mechanisms for ensuring the accuracy and reliability of neural language generation systems.
Anthology ID:
2024.semeval-1.139
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
964–968
URL:
https://aclanthology.org/2024.semeval-1.139
DOI:
10.18653/v1/2024.semeval-1.139
Cite (ACL):
Sankalp Bahad, Yash Bhaskar, and Parameswari Krishnamurthy. 2024. NootNoot At SemEval-2024 Task 6: Hallucinations and Related Observable Overgeneration Mistakes Detection. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 964–968, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
NootNoot At SemEval-2024 Task 6: Hallucinations and Related Observable Overgeneration Mistakes Detection (Bahad et al., SemEval 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.semeval-1.139.pdf
Supplementary material:
 2024.semeval-1.139.SupplementaryMaterial.txt