Implicit Hate Target Span Detection in Zero- and Few-Shot Settings with Selective Sub-Billion Parameter Models

Hossam Boudraa; Benoit Favre; Raquel Urena

Abstract

This work investigates the effectiveness of masked language models (MLMs) and autoregressive language models (LLMs) with fewer than one billion parameters in the detection of implicit hate speech through fine-grained span identification. The evaluation spans zero-shot, few-shot, and full supervision settings across two core benchmarks (SBIC and IHC) and an auxiliary testbed, OffensiveLang.

RoBERTa-Large-355M emerges as the strongest zero-shot model, achieving the highest F1 scores of 75.8 (SBIC) and 72.5 (IHC), outperforming larger models like LLaMA 3.2-1B. ModernBERT-125M closely matches this performance with scores of 75.1 and 72.2, demonstrating the advantage of architectural efficiency. Among instruction-tuned models, SmolLM2-135M Instruct and LLaMA 3.2 1B Instruct consistently outperform their non-instructed counterparts, with up to +2.3 F1 gain on SBIC and +1.7 on IHC. Interestingly, the larger SmolLM2-360M Instruct does not outperform the 135M variant, highlighting that model scale does not always correlate with performance in implicit hate detection tasks.

Few-shot fine-tuning with SmolLM2-135M Instruct achieves F1 scores of 68.2 (SBIC) and 64.0 (IHC), trailing full-data fine-tuning by only 1.6 and 2.0 points, respectively, with accuracy drops under 0.5 points. This illustrates the promise of compact, instruction-aligned models in data-scarce settings, particularly when optimized with Low-Rank Adaptation (LoRA).

Topic-guided error analysis using Latent Dirichlet Allocation (LDA) reveals recurring model failures in ideologically charged or euphemistic discourse. Misclassifications often involve neutral references to identity, politics, or advocacy language, underscoring current limitations in discourse-level inference and sociopragmatic understanding.
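
For readers unfamiliar with how span identification is commonly scored, the following is a minimal sketch of a token-overlap F1 between a predicted and a gold target span. It is an illustration only: the abstract does not state which F1 variant the authors report, and the helper name `token_f1` and the representation of spans as token indices are assumptions, not the paper's evaluation code.

```python
def token_f1(pred_tokens, gold_tokens):
    """Token-overlap F1 between predicted and gold span token indices.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    pred, gold = set(pred_tokens), set(gold_tokens)
    if not pred and not gold:
        # Both spans empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)  # share of predicted tokens that are correct
    recall = overlap / len(gold)     # share of gold tokens that were recovered
    return 2 * precision * recall / (precision + recall)


# Gold target span covers tokens 3-5; the model predicts tokens 4-6.
score = token_f1([4, 5, 6], [3, 4, 5])
```

In this example two of the three predicted tokens overlap the gold span, so precision and recall are both 2/3 and the F1 is 2/3. Corpus-level scores would then be some aggregate over examples; the aggregation used in the paper is not given in the abstract.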

Anthology ID: 2025.woah-1.21
Volume: Proceedings of the The 9th Workshop on Online Abuse and Harms (WOAH)
Month: August
Year: 2025
Address: Vienna, Austria
Editors: Agostina Calabrese, Christine de Kock, Debora Nozza, Flor Miriam Plaza-del-Arco, Zeerak Talat, Francielle Vargas
Venues: WOAH | WS
Publisher: Association for Computational Linguistics
Pages: 228–240
URL: https://preview.aclanthology.org/landing_page/2025.woah-1.21/
Cite (ACL): Hossam Boudraa, Benoit Favre, and Raquel Urena. 2025. Implicit Hate Target Span Detection in Zero- and Few-Shot Settings with Selective Sub-Billion Parameter Models. In Proceedings of the The 9th Workshop on Online Abuse and Harms (WOAH), pages 228–240, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Implicit Hate Target Span Detection in Zero- and Few-Shot Settings with Selective Sub-Billion Parameter Models (Boudraa et al., WOAH 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.woah-1.21.pdf
