Hatevolution: What Static Benchmarks Don’t Tell Us

Chiara Di Bonaventura, Barbara McGillivray, Yulan He, Albert Meroño-Peñuela


Abstract
Language changes over time, including in the hate speech domain, which evolves quickly following social dynamics and cultural shifts. While NLP research has investigated the impact of language evolution on model training and has proposed several solutions for it, its impact on model benchmarking remains under-explored. Yet, hate speech benchmarks play a crucial role in ensuring model safety. In this paper, we empirically evaluate the robustness of 20 language models across two evolving hate speech experiments, and we show the temporal misalignment between static and time-sensitive evaluations. Our findings call for time-sensitive linguistic benchmarks to evaluate language models correctly and reliably in the hate speech domain.
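
As a minimal sketch of the contrast the abstract draws between static and time-sensitive evaluation (illustrative only, not the paper's code: the placeholder classifier, lexicon, examples, and year buckets below are all hypothetical), the following Python snippet scores one pooled test set and then re-scores each temporal slice separately:

# Hypothetical sketch, not the authors' code: a fixed-lexicon classifier
# evaluated once on a pooled (static) test set and once per temporal slice.
from statistics import mean

def predict(text: str) -> int:
    """Placeholder classifier: flags only terms in a frozen lexicon as hateful (1)."""
    lexicon = {"slur_old"}  # a static lexicon misses newer coinages
    return int(any(tok in lexicon for tok in text.split()))

# Each example: (text, gold label, year the expression is attested).
# "slur_old" / "slur_new" are made-up stand-ins for older vs. newer hateful terms.
examples = [
    ("that slur_old again", 1, 2018),
    ("totally fine sentence", 0, 2018),
    ("that slur_new again", 1, 2023),  # emerged after the lexicon was built
    ("another fine sentence", 0, 2023),
]

def accuracy(batch):
    return mean(int(predict(text) == gold) for text, gold, _ in batch)

# Static evaluation: one pooled score averages over all time periods.
print(f"static accuracy: {accuracy(examples):.2f}")

# Time-sensitive evaluation: score each temporal slice on its own.
for year in sorted({yr for _, _, yr in examples}):
    slice_ = [ex for ex in examples if ex[2] == year]
    print(f"{year} slice accuracy: {accuracy(slice_):.2f}")

On this toy data the pooled score (0.75) hides the drop on the newer slice (0.50), the kind of temporal misalignment a single static benchmark cannot surface.
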
Anthology ID: 2025.findings-acl.910
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues: Findings | WS
Publisher: Association for Computational Linguistics
Pages: 17695–17707
URL: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.910/
Cite (ACL): Chiara Di Bonaventura, Barbara McGillivray, Yulan He, and Albert Meroño-Peñuela. 2025. Hatevolution: What Static Benchmarks Don’t Tell Us. In Findings of the Association for Computational Linguistics: ACL 2025, pages 17695–17707, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Hatevolution: What Static Benchmarks Don’t Tell Us (Di Bonaventura et al., Findings 2025)
PDF: https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.910.pdf