Swushroomsia at SemEval-2025 Task 3: Probing LLMs’ Collective Intelligence for Multilingual Hallucination Detection
Sandra Mitrović, Joseph Cornelius, David Kletz, Ljiljana Dolamic, Fabio Rinaldi
Abstract
This paper introduces a system designed for SemEval-2025 Task 3: Mu-SHROOM, which focuses on detecting hallucinations in multilingual outputs generated by large language models (LLMs). Our approach leverages the collective intelligence of multiple LLMs by prompting several models with three distinct prompts to annotate hallucinations. These individual annotations are then merged to create a comprehensive probabilistic annotation. The proposed system demonstrates strong performance, achieving high accuracy in span detection and strong correlation between predicted probabilities and ground truth annotations.
- Anthology ID:
- 2025.semeval-1.239
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1810–1827
- URL:
- https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.239/
- Cite (ACL):
- Sandra Mitrović, Joseph Cornelius, David Kletz, Ljiljana Dolamic, and Fabio Rinaldi. 2025. Swushroomsia at SemEval-2025 Task 3: Probing LLMs’ Collective Intelligence for Multilingual Hallucination Detection. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1810–1827, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Swushroomsia at SemEval-2025 Task 3: Probing LLMs’ Collective Intelligence for Multilingual Hallucination Detection (Mitrović et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.239.pdf
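
The abstract says that annotations from several LLM and prompt combinations are merged into one probabilistic annotation, but it does not specify the merging rule. The sketch below illustrates one plausible reading only: each character's probability is the fraction of annotators that flagged it. The function names `merge_annotations` and `to_soft_spans`, the character-offset span format, and the simple vote-fraction rule are all assumptions made for illustration, not the authors' published method.

```python
from typing import List, Tuple

# Assumed span format: (start, end) character offsets, end exclusive.
Span = Tuple[int, int]


def merge_annotations(text_len: int, annotations: List[List[Span]]) -> List[float]:
    """Return, for each character, the fraction of annotators that flagged it.

    `annotations` holds one list of spans per annotator, where an annotator
    is one (model, prompt) pair. This vote-fraction rule is an assumption.
    """
    if not annotations:
        return [0.0] * text_len
    votes = [0] * text_len
    for spans in annotations:
        flagged = set()  # count each annotator at most once per character
        for start, end in spans:
            flagged.update(range(max(0, start), min(text_len, end)))
        for i in flagged:
            votes[i] += 1
    return [v / len(annotations) for v in votes]


def to_soft_spans(probs: List[float]) -> List[Tuple[int, int, float]]:
    """Group adjacent characters with the same non-zero probability."""
    spans = []
    start = None
    for i, p in enumerate(probs + [0.0]):  # sentinel closes a trailing span
        if start is not None and (i == len(probs) or p != probs[start]):
            spans.append((start, i, probs[start]))
            start = None
        if start is None and i < len(probs) and p > 0:
            start = i
    return spans


# Two hypothetical annotators with partially overlapping spans on a 10-character text.
probs = merge_annotations(10, [[(0, 4)], [(2, 6)]])
soft_spans = to_soft_spans(probs)
```

Characters flagged by both annotators receive probability 1.0 and those flagged by only one receive 0.5, so overlap between annotators translates directly into higher confidence.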