@inproceedings{faisal-etal-2025-dialectal,
title = "Dialectal Toxicity Detection: Evaluating {LLM}-as-a-Judge Consistency Across Language Varieties",
author = "Faisal, Fahim and
Rahman, Md Mushfiqur and
Anastasopoulos, Antonios",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-emnlp.664/",
doi = "10.18653/v1/2025.findings-emnlp.664",
pages = "12429--12452",
isbn = "979-8-89176-335-7",
abstract = "There has been little systematic study on how dialectal differences affect toxicity detection by modern LLMs. Furthermore, although using LLMs as evaluators ({``}LLM-as-a-judge{''}) is a growing research area, their sensitivity to dialectal nuances is still underexplored and requires more focused attention. In this paper, we address these gaps through a comprehensive toxicity evaluation of LLMs across diverse dialects. We create a multi-dialect dataset through synthetic transformations and human-assisted translations, covering 10 language clusters and 60 varieties. We then evaluate five LLMs on their ability to assess toxicity, measuring multilingual, dialectal, and LLM-human consistency. Our findings show that LLMs are sensitive to both dialectal shifts and low-resource multilingual variation, though the most persistent challenge remains aligning their predictions with human judgments."
}
Markdown (Informal)
[Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language Varieties](https://aclanthology.org/2025.findings-emnlp.664/) (Faisal et al., Findings 2025)
ACL