Howard University - AI4PC at SemEval-2025 Task 3: Logit-based Supervised Token Classification for Multilingual Hallucination Span Identification Using XGBOD

Saurav Aryal, Mildness Akomoize


Abstract
This paper describes our system for SemEval-2025 Task 3, Mu-SHROOM, which focuses on detecting hallucination spans in multilingual LLM outputs. We reframe hallucination detection as a point-wise anomaly detection problem by treating logits as time-series data. Our approach extracts features from token-level logits, addresses class imbalance with SMOTE, and trains an XGBOD model to produce probabilistic character-level predictions. Our system, which relies solely on information derived from the logits and token offsets (obtained from pretrained tokenizers), achieves competitive intersection-over-union (IoU) and correlation scores on the validation and test sets.
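The abstract mentions token offsets, character-level probabilities, and IoU scoring. The following is a minimal illustrative sketch of how token-level probabilities might be projected onto characters and scored. It is not the authors' code. The function names, the 0.5 threshold, the max-pooling of overlapping tokens, the convention that two empty span sets score 1.0, and the hard-coded probabilities (standing in for the output of a classifier such as XGBOD) are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open character range [start, end)


def token_probs_to_char_probs(
    text_len: int, offsets: List[Span], token_probs: List[float]
) -> List[float]:
    """Spread each token's probability over the characters it covers.

    Characters covered by no token (e.g. spaces) keep probability 0.0.
    If tokens overlap, the larger probability wins (an assumed choice).
    """
    char_probs = [0.0] * text_len
    for (start, end), prob in zip(offsets, token_probs):
        for i in range(start, min(end, text_len)):
            char_probs[i] = max(char_probs[i], prob)
    return char_probs


def char_probs_to_spans(char_probs: List[float], threshold: float = 0.5) -> List[Span]:
    """Group consecutive characters at or above the threshold into spans."""
    spans: List[Span] = []
    start = None
    for i, prob in enumerate(char_probs):
        if prob >= threshold and start is None:
            start = i
        elif prob < threshold and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(char_probs)))
    return spans


def char_iou(pred_spans: List[Span], gold_spans: List[Span]) -> float:
    """Intersection-over-union of the character index sets of two span lists."""
    pred = {i for s, e in pred_spans for i in range(s, e)}
    gold = {i for s, e in gold_spans for i in range(s, e)}
    if not pred and not gold:
        return 1.0  # assumed convention: both empty counts as a perfect match
    return len(pred & gold) / len(pred | gold)


# Hypothetical example: offsets as a pretrained tokenizer might return them.
text = "Paris is in Spain."
offsets = [(0, 5), (6, 8), (9, 11), (12, 17), (17, 18)]
token_probs = [0.1, 0.05, 0.2, 0.9, 0.3]  # made-up hallucination probabilities

char_probs = token_probs_to_char_probs(len(text), offsets, token_probs)
pred_spans = char_probs_to_spans(char_probs)
```

In this toy input only the token "Spain" exceeds the threshold, so the predicted span covers characters 12 to 17. Scoring it against a gold span of (9, 17) gives an IoU of 5/8.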
Anthology ID:
2025.semeval-1.236
Volume:
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
1790–1794
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.236/
Cite (ACL):
Saurav Aryal and Mildness Akomoize. 2025. Howard University - AI4PC at SemEval-2025 Task 3: Logit-based Supervised Token Classification for Multilingual Hallucination Span Identification Using XGBOD. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1790–1794, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Howard University - AI4PC at SemEval-2025 Task 3: Logit-based Supervised Token Classification for Multilingual Hallucination Span Identification Using XGBOD (Aryal & Akomoize, SemEval 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.semeval-1.236.pdf