SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection

Bradley Allen; Fina Polat; Paul Groth

SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection

Abstract

We describe the University of Amsterdam Intelligent Data Engineering Lab team’s entry for the SemEval-2024 Task 6 competition. The SHROOM-INDElab system builds on previous work on using prompt programming and in-context learning with large language models (LLMs) to build classifiers for hallucination detection, and extends that work through the incorporation of context-specific definition of task, role, and target concept, and automated generation of examples for use in a few-shot prompting approach. The resulting system achieved fourth-best and sixth-best performance in the model-agnostic track and model-aware tracks for Task 6, respectively, and evaluation using the validation sets showed that the system’s classification decisions were consistent with those of the crowdsourced human labelers. We further found that a zero-shot approach provided better accuracy than a few-shot approach using automatically generated examples. Code for the system described in this paper is available on Github.

Anthology ID:: 2024.semeval-1.120
Volume:: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:: June
Year:: 2024
Address:: Mexico City, Mexico
Editors:: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:: SemEval
SIG:: SIGLEX
Publisher:: Association for Computational Linguistics
Note:
Pages:: 839–844
Language:
URL:: https://aclanthology.org/2024.semeval-1.120
DOI:
Bibkey:
Cite (ACL):: Bradley Allen, Fina Polat, and Paul Groth. 2024. SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 839–844, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):: SHROOM-INDElab at SemEval-2024 Task 6: Zero- and Few-Shot LLM-Based Classification for Hallucination Detection (Allen et al., SemEval 2024)
Copy Citation:
PDF:: https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.semeval-1.120.pdf
Supplementary material:: 2024.semeval-1.120.SupplementaryMaterial.zip
Supplementary material:: 2024.semeval-1.120.SupplementaryMaterial.txt

PDF Search Supplementary material Supplementary material