SmurfCat at SemEval-2025 Task 3: Bridging External Knowledge and Model Uncertainty for Enhanced Hallucination Detection
Elisei Rykov, Valerii Olisov, Maksim Savkin, Artem Vazhentsev, Kseniia Titova, Alexander Panchenko, Vasily Konovalov, Julia Belikova
Abstract
The Multilingual shared-task on Hallucinations and Related Observable Overgeneration Mistakes in the SemEval-2025 competition aims to detect hallucination spans in the outputs of instruction-tuned LLMs in a multilingual context. In this paper, we address the detection of span hallucinations by applying an ensemble of approaches. In particular, we synthesized a PsiloQA dataset and fine-tuned LLM to detect hallucination spans. In addition, we combined this approach with a white-box method based on uncertainty quantification techniques. Using our combined pipeline, we achieved 3rd place in detecting span hallucinations in Arabic, Catalan, Finnish, Italian, and ranked within the top ten for the rest of the languages.- Anthology ID:
- 2025.semeval-1.137
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 1034–1045
- Language:
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.137/
- DOI:
- Cite (ACL):
- Elisei Rykov, Valerii Olisov, Maksim Savkin, Artem Vazhentsev, Kseniia Titova, Alexander Panchenko, Vasily Konovalov, and Julia Belikova. 2025. SmurfCat at SemEval-2025 Task 3: Bridging External Knowledge and Model Uncertainty for Enhanced Hallucination Detection. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1034–1045, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- SmurfCat at SemEval-2025 Task 3: Bridging External Knowledge and Model Uncertainty for Enhanced Hallucination Detection (Rykov et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.137.pdf