Team Cantharellus at SemEval-2025 Task 3: Hallucination Span Detection with Fine Tuning on Weakly Supervised Synthetic Data

Xinyuan Mo, Nikolay Vorontsov, Tiankai Zang


Abstract
This paper describes our submission to SemEval-2025 Task 3: Mu-SHROOM, the Multilingual Shared-task on Hallucinations and Related Observable Overgeneration Mistakes, which primarily aims to detect spans of LLM-generated text that correspond to hallucinations in a multilingual, multi-model context. We explored an approach based on fine-tuning pretrained language models available on Hugging Face. The results show that predictions made by a pretrained model fine-tuned on synthetic data achieve a relatively high degree of alignment with human-generated labels. We participated in 13 of the 14 available languages and achieved an average ranking of 10th out of 41 participating teams, with our best result placing in the top 5.
Anthology ID:
2025.semeval-1.226
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
1724–1736
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.226/
Cite (ACL):
Xinyuan Mo, Nikolay Vorontsov, and Tiankai Zang. 2025. Team Cantharellus at SemEval-2025 Task 3: Hallucination Span Detection with Fine Tuning on Weakly Supervised Synthetic Data. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1724–1736, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Team Cantharellus at SemEval-2025 Task 3: Hallucination Span Detection with Fine Tuning on Weakly Supervised Synthetic Data (Mo et al., SemEval 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.226.pdf