DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects

Jason S Lucas, Matt Murtagh White, Ali Al-Lawati, Uchendu Uchendu, Adaku Uchendu, Dongwon Lee


Abstract
Harmful content detectors, particularly disinformation classifiers, are predominantly developed and evaluated on Standard American English (SAE), leaving their robustness to dialectal variation unexplored. We present DIA-HARM, the first benchmark for evaluating disinformation detection robustness across 50 English dialects spanning U.S., British, African, Caribbean, and Asia-Pacific varieties. Using Multi-VALUE’s linguistically grounded transformations, we introduce D-CUBE (Dialectal Disinformation Detection Corpus), a core corpus component of DIA-HARM comprising 195K samples derived from established disinformation benchmarks. Our evaluation of 16 detection models reveals systematic vulnerabilities: human-written dialectal content degrades detection by 1.4–3.6% F1, while AI-generated content remains stable. Fine-tuned transformers substantially outperform zero-shot LLMs (96.6% vs. 78.3% best-case F1), with some models exhibiting catastrophic failures exceeding 33% degradation on mixed content. Cross-dialectal transfer analysis across 2,450 dialect pairs shows that multilingual models (mDeBERTa: 97.2% average F1) generalize effectively, while monolingual models like RoBERTa and XLM-RoBERTa fail on dialectal inputs. These findings demonstrate that current disinformation detectors may systematically disadvantage hundreds of millions of non-SAE speakers worldwide. We release the benchmark, including the D-CUBE corpus, and evaluation tools.
Anthology ID:
2026.acl-long.144
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3171–3214
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.144/
Cite (ACL):
Jason S Lucas, Matt Murtagh White, Ali Al-Lawati, Uchendu Uchendu, Adaku Uchendu, and Dongwon Lee. 2026. DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3171–3214, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects (Lucas et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.144.pdf
Checklist:
 2026.acl-long.144.checklist.pdf