Deloitte (Drocks) at SemEval-2025 Task 3: Fine-Grained Multi-lingual Hallucination Detection Using Internal LLM Weights
Alex Chandler, Harika Abburi, Sanmitra Bhattacharya, Edward Bowen, Nirmala Pudota
Abstract
Large Language Models (LLMs) have greatly advanced the field of Natural Language Generation (NLG). Despite their remarkable capabilities, their tendency to hallucinate, that is, to produce inaccurate or misleading information, remains a barrier to wider adoption. Current hallucination detection methods mainly employ coarse-grained binary classification at the sentence or document level, overlooking the need for precise identification of the specific text spans containing hallucinations. In this paper, we propose a methodology that generates supplementary context and processes text using an LLM to extract internal weights (features) from various layers. These extracted features serve as input for a neural network classifier designed to perform token-level binary detection of hallucinations. We then map the resulting token-level predictions to character-level predictions, enabling the identification of spans of hallucinated text, which we refer to as hallucination spans. Our model achieved a top-ten ranking in 13 of the 14 languages and secured first place for French in the SemEval-2025 Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes (Mu-SHROOM), using the Mu-SHROOM dataset provided by the task organizers.
- Anthology ID:
- 2025.semeval-1.144
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1089–1097
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.144/
- Cite (ACL):
- Alex Chandler, Harika Abburi, Sanmitra Bhattacharya, Edward Bowen, and Nirmala Pudota. 2025. Deloitte (Drocks) at SemEval-2025 Task 3: Fine-Grained Multi-lingual Hallucination Detection Using Internal LLM Weights. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1089–1097, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Deloitte (Drocks) at SemEval-2025 Task 3: Fine-Grained Multi-lingual Hallucination Detection Using Internal LLM Weights (Chandler et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.144.pdf
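The abstract's final step, mapping token-level predictions to character-level hallucination spans, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `tokens_to_char_spans`, the whitespace-based offsets, the example labels, and the rule that merges flagged tokens separated by at most one character are all assumptions made for illustration.

```python
from typing import List, Tuple


def tokens_to_char_spans(
    offsets: List[Tuple[int, int]],
    labels: List[int],
) -> List[Tuple[int, int]]:
    """Merge tokens labelled 1 into half-open [start, end) character spans.

    offsets: (start, end) character positions of each token in the text.
    labels:  binary token-level predictions (1 = hallucinated).
    """
    spans: List[Tuple[int, int]] = []
    for (start, end), label in zip(offsets, labels):
        # Skip unflagged tokens and zero-width tokens.
        if label != 1 or start == end:
            continue
        # Assumed merging rule: extend the previous span when this token
        # touches it or is separated by a single character (e.g. a space).
        if spans and start - spans[-1][1] <= 1:
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
    return spans


# Hypothetical example: offsets from a simple whitespace/punctuation split.
text = "Paris is the capital of Germany since 1999."
offsets = [(0, 5), (6, 8), (9, 12), (13, 20), (21, 23),
           (24, 31), (32, 37), (38, 42), (42, 43)]
labels = [0, 0, 0, 0, 0, 1, 0, 1, 0]  # made-up classifier output

spans = tokens_to_char_spans(offsets, labels)
flagged = [text[s:e] for s, e in spans]
```

In this sketch the character offsets are supplied by whatever tokenizer produced the tokens; with subword tokenizers, adjacent flagged pieces of the same word touch each other and are therefore merged into one span by the rule above.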