Abstract
This paper presents the Hallucination Recognition Model for New Experiment Evaluation (HaRMoNEE) team’s winning (#1) and #10 submissions to the two subtasks of SemEval-2024 Task 6: Shared-task on Hallucinations and Related Observable Overgeneration Mistakes (SHROOM). The task challenged participants to design systems that detect hallucinations in Large Language Model (LLM) outputs. Team HaRMoNEE proposes two architectures: (1) fine-tuning an off-the-shelf transformer-based model and (2) prompt tuning large-scale LLMs. One submission from the fine-tuning approach outperformed all other submissions on the model-aware subtask; one submission from the prompt-tuning approach ranked 10th on the leaderboard for the model-agnostic subtask. Our systems also include pre-processing, system-specific tuning, post-processing, and evaluation.
- Anthology ID: 2024.semeval-1.191
- Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month: June
- Year: 2024
- Address: Mexico City, Mexico
- Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue: SemEval
- SIG: SIGLEX
- Publisher: Association for Computational Linguistics
- Pages: 1322–1331
- URL: https://aclanthology.org/2024.semeval-1.191
- DOI: 10.18653/v1/2024.semeval-1.191
- Cite (ACL): Timothy Obiso, Jingxuan Tu, and James Pustejovsky. 2024. HaRMoNEE at SemEval-2024 Task 6: Tuning-based Approaches to Hallucination Recognition. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1322–1331, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal): HaRMoNEE at SemEval-2024 Task 6: Tuning-based Approaches to Hallucination Recognition (Obiso et al., SemEval 2024)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/2024.semeval-1.191.pdf