Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study

Alba María Mármol-Romero, Robiert Sepúlveda-Torres, Estela Saquete, María-Teresa Martín-Valdivia, L. Alfonso Ureña


Abstract
The rise of toxic content on digital platforms has intensified the demand for automatic moderation tools. While English has benefited from large-scale annotated corpora, Spanish remains under-resourced, particularly for nuanced cases of toxicity such as irony, sarcasm, or indirect aggression. We present an extended version of the NECOS-TOX corpus, comprising 4,011 Spanish comments collected from 16 major news outlets. Each comment is annotated across three levels of toxicity (Non-Toxic, Slightly Toxic, and Toxic), following an iterative annotation protocol that achieved substantial inter-annotator agreement (κ = 0.74). To reduce annotation costs while maintaining quality, we employed a human-in-the-loop active learning strategy, with manual correction of model pre-labels. We benchmarked the dataset with traditional machine learning (ML) methods, domain-specific transformers, and instruction-tuned large language models (LLMs). Results show that compact encoder models (e.g., RoBERTa-base-bne, 125M parameters) perform on par with much larger models (e.g., LLaMA-3.1-8B), underscoring the value of in-domain adaptation over raw scale. Our error analysis highlights persistent challenges in distinguishing subtle forms of toxicity, especially sarcasm and implicit insults, and reveals entity-related biases that motivate anonymization strategies. The dataset and trained models are released publicly.
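For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how a kappa value over the three toxicity levels could be computed. The abstract does not say which kappa variant was used, so this assumes Cohen's kappa between two annotators; the function name `cohen_kappa` and the toy label lists are illustrative only and do not come from the corpus.

```python
from collections import Counter

# The three annotation levels named in the abstract.
LABELS = ("Non-Toxic", "Slightly Toxic", "Toxic")


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators, nominal labels. The paper may have used
    a different variant (e.g. Fleiss' kappa for more than two annotators).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(counts_a) | set(counts_b)
    ) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example with made-up annotations (not taken from NECOS-TOX).
annotator_1 = ["Non-Toxic", "Non-Toxic", "Slightly Toxic", "Toxic", "Toxic", "Non-Toxic"]
annotator_2 = ["Non-Toxic", "Slightly Toxic", "Slightly Toxic", "Toxic", "Non-Toxic", "Non-Toxic"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Values between 0.61 and 0.80 are conventionally read as substantial agreement, which matches the wording the abstract uses for the reported 0.74.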
Anthology ID:
2026.findings-eacl.100
Volume:
Findings of the Association for Computational Linguistics: EACL 2026
Month:
March
Year:
2026
Address:
Rabat, Morocco
Editors:
Vera Demberg, Kentaro Inui, Lluís Marquez
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1940–1954
URL:
https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.100/
Cite (ACL):
Alba María Mármol-Romero, Robiert Sepúlveda-Torres, Estela Saquete, María-Teresa Martín-Valdivia, and L. Alfonso Ureña. 2026. Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study. In Findings of the Association for Computational Linguistics: EACL 2026, pages 1940–1954, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal):
Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study (Mármol-Romero et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.100.pdf
Checklist:
 2026.findings-eacl.100.checklist.pdf