RaggedyFive at SemEval-2025 Task 3: Hallucination Span Detection Using Unverifiable Answer Detection
Wessel Heerema, Collin Krooneman, Simon Van Loon, Jelmer Top, Maurice Voors
Abstract
Despite their broad utility, large language models (LLMs) are prone to hallucinations. Output that deviates from the provided source input or conflicts with established facts makes users question the reliability of LLMs. Detection systems for LLM hallucinations are therefore imperative. The system described in this paper detects hallucinated text spans by combining Retrieval-Augmented Generation (RAG) with Natural Language Inference (NLI). While zero-context handling in the RAG component had little measurable effect, incorporating the retrieved context into a natural-language premise for the NLI model yielded a noticeable improvement. Discrepancies can be attributed to the labeling methodology and the implementation of the RAG component.
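The pipeline the abstract outlines can be pictured as: retrieve supporting context for the input question, treat that context as a natural-language premise, and flag answer spans that an NLI model cannot verify against it. The sketch below is a minimal illustration of that idea, not the authors' implementation; the roberta-large-mnli checkpoint, the naive sentence segmentation, the 0.5 entailment threshold, and the idea of passing the retrieved premise in directly (a real system would obtain it from a RAG retriever) are all assumptions.

```python
# Minimal sketch of a RAG + NLI hallucination-span detector.
# Assumptions (not from the paper): the NLI checkpoint, the naive
# sentence split, and the 0.5 entailment threshold are illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # roberta-large-mnli label order: 0=CONTRADICTION, 1=NEUTRAL, 2=ENTAILMENT
    return logits.softmax(dim=-1)[0, 2].item()

def detect_hallucinated_spans(premise: str, answer: str, threshold: float = 0.5):
    """Return character-offset spans of the answer that the NLI model
    cannot verify against the premise. In a full RAG setup, the premise
    would come from retrieved passages (e.g. a Wikipedia index)."""
    spans, cursor = [], 0
    for sentence in answer.split(". "):  # naive sentence segmentation
        start = answer.find(sentence, cursor)
        cursor = start + len(sentence)
        if entailment_prob(premise, sentence) < threshold:
            spans.append((start, cursor))  # unverifiable -> hallucinated
    return spans

# Hypothetical usage: the unsupported population claim would likely
# fall below the entailment threshold and be flagged.
spans = detect_hallucinated_spans(
    premise="Paris is the capital of France.",
    answer="Paris is the capital of France. It has 40 million residents.",
)
print(spans)
```

Treating any span that is not entailed by the retrieved premise as hallucinated mirrors the "unverifiable answer detection" framing in the paper's title: the system does not need to prove a claim false, only that the retrieved evidence cannot verify it.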
- Anthology ID: 2025.semeval-1.194
- Volume: Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month: July
- Year: 2025
- Address: Vienna, Austria
- Editors: Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues: SemEval | WS
- Publisher: Association for Computational Linguistics
- Pages: 1473–1478
- URL: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.194/
- Cite (ACL): Wessel Heerema, Collin Krooneman, Simon Van Loon, Jelmer Top, and Maurice Voors. 2025. RaggedyFive at SemEval-2025 Task 3: Hallucination Span Detection Using Unverifiable Answer Detection. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1473–1478, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal): RaggedyFive at SemEval-2025 Task 3: Hallucination Span Detection Using Unverifiable Answer Detection (Heerema et al., SemEval 2025)
- PDF: https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.194.pdf