2025
FoodSafeSum: Enabling Natural Language Processing Applications for Food Safety Document Summarization and Analysis
Juli Bakagianni | Korbinian Randl | Guido Rocchietti | Cosimo Rulli | Franco Maria Nardini | Salvatore Trani | Aron Henriksson | Anna Romanova | John Pavlopoulos
Findings of the Association for Computational Linguistics: EMNLP 2025
Food safety demands timely detection, regulation, and public communication, yet the lack of structured datasets hinders Natural Language Processing (NLP) research. We present and release a new dataset of human-written and Large Language Model (LLM)-generated summaries of food safety documents, plus food safety-related metadata. We evaluate its utility on three NLP tasks directly reflecting food safety practices: multilabel classification for organizing documents into domain-specific categories; document retrieval for accessing regulatory and scientific evidence; and question answering via retrieval-augmented generation that improves factual accuracy. We show that LLM summaries perform comparably to or better than human ones across tasks. We also demonstrate clustering of summaries for event tracking and compliance monitoring. This dataset enables NLP applications that support core food safety practices, including the organization of regulatory and scientific evidence, monitoring of compliance issues, and communication of risks to the public.
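The retrieval-augmented question-answering setup described above can be illustrated with a minimal sketch: retrieve the most similar summaries for a question and pass them as context to a generator. This is not the paper's pipeline; the TF-IDF retriever and the `generate_answer` completion function are assumptions used only for illustration.

```python
# Minimal RAG sketch over food safety summaries (illustrative, not the paper's system).
# Assumes: `summaries` is a list of summary strings; `generate_answer` is any
# hypothetical LLM completion function that maps a prompt string to an answer.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def build_index(summaries):
    # Sparse lexical index over the summaries.
    vec = TfidfVectorizer(stop_words="english")
    return vec, vec.fit_transform(summaries)


def retrieve(vec, matrix, summaries, query, k=3):
    # Rank summaries by cosine similarity to the query and keep the top k.
    sims = cosine_similarity(vec.transform([query]), matrix)[0]
    top = sims.argsort()[::-1][:k]
    return [summaries[i] for i in top]


def answer(question, summaries, generate_answer, k=3):
    vec, matrix = build_index(summaries)
    context = "\n\n".join(retrieve(vec, matrix, summaries, question, k))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate_answer(prompt)  # plug in any LLM here
```

A dense retriever or reranker could replace the TF-IDF step without changing the overall structure.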
SemEval-2025 Task 9: The Food Hazard Detection Challenge
Korbinian Randl | John Pavlopoulos | Aron Henriksson | Tony Lindgren | Juli Bakagianni
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
In this challenge, we explored text-based food hazard prediction with long-tail distributed classes. The task was divided into two subtasks: (1) predicting whether a web text implies one of ten food-hazard categories and identifying the associated food category, and (2) providing a more fine-grained classification by assigning a specific label to both the hazard and the product. Our findings highlight that large language model-generated synthetic data can be highly effective for oversampling long-tail distributions. Furthermore, we find that fine-tuned encoder-only, encoder-decoder, and decoder-only systems achieve comparable maximum performance across both subtasks. During this challenge, we are gradually releasing (under CC BY-NC-SA 4.0) a novel set of 6,644 manually labeled food-incident reports.
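The oversampling finding above can be sketched as follows: upsample each tail class toward the head-class count using synthetic texts. The `synthesize(label, n)` helper is a hypothetical placeholder for an LLM prompted to generate incident reports of the given class; it is not part of the released task code.

```python
# Minimal sketch of long-tail oversampling with synthetic examples (illustrative).
# Assumes a hypothetical helper `synthesize(label, n)` that returns n new texts
# for the given class, e.g. produced by prompting an LLM.
from collections import Counter


def oversample(texts, labels, synthesize, target=None):
    counts = Counter(labels)
    # Default: bring every class up to the size of the most frequent class.
    target = target or max(counts.values())
    new_texts, new_labels = list(texts), list(labels)
    for label, count in counts.items():
        missing = target - count
        if missing > 0:
            new_texts.extend(synthesize(label, missing))  # synthetic minority examples
            new_labels.extend([label] * missing)
    return new_texts, new_labels
```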
2024
CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification
Korbinian Randl | John Pavlopoulos | Aron Henriksson | Tony Lindgren
Findings of the Association for Computational Linguistics: ACL 2024
Contaminated or adulterated food poses a substantial risk to human health. Given sets of labeled web texts for training, Machine Learning and Natural Language Processing can be applied to automatically detect such risks. We publish a dataset of 7,546 short texts describing public food recall announcements. Each text is manually labeled, on two granularity levels (coarse and fine), for food products and hazards that the recall corresponds to. We describe the dataset and benchmark naive, traditional, and Transformer models. Based on our analysis, Logistic Regression based on a TF-IDF representation outperforms RoBERTa and XLM-R on classes with low support. Finally, we discuss different prompting strategies and present an LLM-in-the-loop framework, based on Conformal Prediction, which boosts the performance of the base classifier while reducing energy consumption compared to normal prompting.
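The LLM-in-the-loop idea described in the abstract can be sketched in a few lines: calibrate the base classifier with split conformal prediction, and only query an LLM when the conformal prediction set contains more than one candidate label, restricting the prompt to those candidates. This is a simplified illustration of the general technique, not the CICLe implementation; the `ask_llm` function and the specific nonconformity score are assumptions.

```python
# Minimal sketch of a conformal LLM-in-the-loop classifier (illustrative).
# Assumes a hypothetical `ask_llm(text, candidates)` function that returns one
# of the candidate labels, e.g. via a short multiple-choice prompt.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def fit_base(train_texts, train_labels):
    vec = TfidfVectorizer(min_df=2)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(vec.fit_transform(train_texts), train_labels)
    return vec, clf


def calibrate(vec, clf, cal_texts, cal_labels, alpha=0.1):
    # Nonconformity score: 1 - predicted probability of the true class.
    probs = clf.predict_proba(vec.transform(cal_texts))
    idx = [list(clf.classes_).index(y) for y in cal_labels]
    scores = 1.0 - probs[np.arange(len(cal_labels)), idx]
    # Split-conformal quantile with finite-sample correction (capped at 1.0).
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, level, method="higher")


def predict(vec, clf, q, text, ask_llm=None):
    probs = clf.predict_proba(vec.transform([text]))[0]
    pred_set = [c for c, p in zip(clf.classes_, probs) if 1.0 - p <= q]
    if len(pred_set) <= 1 or ask_llm is None:
        # Unambiguous (or no LLM available): trust the base classifier.
        return pred_set[0] if pred_set else clf.classes_[np.argmax(probs)]
    # Ambiguous case: let the LLM choose among the few candidate labels,
    # which keeps prompts short compared to listing the full label space.
    return ask_llm(text, candidates=pred_set)
```

Restricting the LLM to the conformal prediction set is what keeps prompts short and limits the number of LLM calls, which is the source of the energy savings mentioned in the abstract.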