Sayani Basak

2025

Cyberbullying (CB) involves complex relational dynamics that are often oversimplified as a binary classification task. Existing youth-focused CB datasets rely on scripted role-play, which lacks conversational realism and ethical youth involvement, and their social plausibility is rarely, if ever, evaluated. To address this, we introduce “BullyBench”, a youth-in-the-loop dataset developed by adolescents (ages 15–16) through an ethical co-research framework. We introduce a structured intrinsic quality evaluation with experts-in-the-loop (social scientists, psychologists, and content moderators) that assesses realism, relevance, and coherence in youth CB data. Additionally, we perform an extrinsic baseline evaluation of the dataset by benchmarking encoder- and decoder-only language models on multi-class CB role classification, providing baselines for future research. A three-stage annotation process by young adults refines the dataset into a gold-standard test benchmark, a high-quality resource for CB detection grounded in minors’ lived experiences. Code and data are available for review.

Co-authors

Darragh McCashin

Tijana Milosevic

James O'Higgins Norman

Alexandros Poulis

Rebecca Umbach

Kanishk Verma

Joachim Wagner

Isobel Walsh

Venues

EMNLP
