SemEval-2025 Task 9: The Food Hazard Detection Challenge
Korbinian Randl, John Pavlopoulos, Aron Henriksson, Tony Lindgren, Juli Bakagianni
Abstract
In this challenge, we explored text-based food hazard prediction with long tail distributed classes. The task was divided into two subtasks: (1) predicting whether a web text implies one of ten food-hazard categories and identifying the associated food category, and (2) providing a more fine-grained classification by assigning a specific label to both the hazard and the product. Our findings highlight that large language model-generated synthetic data can be highly effective for oversampling long-tail distributions. Furthermore, we find that fine-tuned encoder-only, encoder-decoder, and decoder-only systems achieve comparable maximum performance across both subtasks. During this challenge, we are gradually releasing (under CC BY-NC-SA 4.0) a novel set of 6,644 manually labeled food-incident reports.- Anthology ID:
- 2025.semeval-1.325
- Volume:
- Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Sara Rosenthal, Aiala Rosá, Debanjan Ghosh, Marcos Zampieri
- Venues:
- SemEval | WS
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2523–2534
- Language:
- URL:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.325/
- DOI:
- Cite (ACL):
- Korbinian Randl, John Pavlopoulos, Aron Henriksson, Tony Lindgren, and Juli Bakagianni. 2025. SemEval-2025 Task 9: The Food Hazard Detection Challenge. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 2523–2534, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- SemEval-2025 Task 9: The Food Hazard Detection Challenge (Randl et al., SemEval 2025)
- PDF:
- https://preview.aclanthology.org/transition-to-people-yaml/2025.semeval-1.325.pdf