Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks
Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve, Skyler Seto, Allison Koenecke
Abstract
Large language models (LLMs) are ubiquitous in modern-day natural language processing. However, previous work has shown degraded LLM performance for under-represented English dialects. We analyze the effects of typifying “standard” American English questions as non-“standard” dialectal variants on multiple-choice question answering tasks and find up to a 20% reduction in accuracy. Additionally, we investigate the grammatical basis of under-performance on non-“standard” English questions. We find that individual grammatical rules have varied effects on performance, but some are more consequential than others: three specific grammar rules (existential “it”, zero copula, and y’all) can explain the majority of the performance degradation observed across multiple dialects. We call for future work to investigate bias mitigation methods focused on individual, high-impact grammatical structures.
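As an illustration of the kind of analysis the abstract describes, the sketch below applies crude regex stand-ins for the three high-impact features (existential “it”, zero copula, y’all) to multiple-choice questions and measures per-rule accuracy drops against a baseline. The `ask_model` helper, the rewrite rules, and the toy data are hypothetical placeholders for illustration only; they do not reproduce the paper’s transformation pipeline or benchmarks.

```python
import re
from typing import Callable, Dict, List

# Crude regex approximations of three dialect features named in the abstract.
# Illustrative only; not the paper's actual transformation rules.
RULES: Dict[str, Callable[[str], str]] = {
    # "there is/are ..." -> "it's ..." (existential "it")
    "existential_it": lambda q: re.sub(r"\bthere (is|are)\b", "it's", q, flags=re.IGNORECASE),
    # drop present-tense copulas, e.g. "the answer is below" -> "the answer below"
    "zero_copula": lambda q: re.sub(r"\b(is|are) ", "", q),
    # "you all" -> "y'all"
    "yall": lambda q: re.sub(r"\byou all\b", "y'all", q, flags=re.IGNORECASE),
}


def accuracy(questions: List[dict], answer_fn: Callable[[str, List[str]], str]) -> float:
    """Fraction of multiple-choice questions answered correctly."""
    correct = sum(answer_fn(q["question"], q["choices"]) == q["answer"] for q in questions)
    return correct / len(questions)


def per_rule_deltas(questions: List[dict], answer_fn: Callable[[str, List[str]], str]):
    """Accuracy drop when each grammar rule is applied to the question text in isolation."""
    baseline = accuracy(questions, answer_fn)
    deltas = {}
    for name, rewrite in RULES.items():
        rewritten = [{**q, "question": rewrite(q["question"])} for q in questions]
        deltas[name] = baseline - accuracy(rewritten, answer_fn)
    return baseline, deltas


if __name__ == "__main__":
    # Hypothetical stand-in for an LLM call that returns one of the answer
    # choices; here it simply picks the first choice.
    def ask_model(question: str, choices: List[str]) -> str:
        return choices[0]

    toy = [
        {
            "question": "There are three states of matter listed. Which one is densest?",
            "choices": ["solid", "gas"],
            "answer": "solid",
        },
    ]
    base, deltas = per_rule_deltas(toy, ask_model)
    print(f"baseline accuracy: {base:.2f}")
    for rule, drop in deltas.items():
        print(f"{rule}: accuracy drop {drop:+.2f}")
```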
- Anthology ID: 2025.findings-emnlp.1139
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 20882–20893
- URL: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1139/
- DOI: 10.18653/v1/2025.findings-emnlp.1139
- Cite (ACL): Eileen Pan, Anna Seo Gyeong Choi, Maartje Ter Hoeve, Skyler Seto, and Allison Koenecke. 2025. Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 20882–20893, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): Analyzing Dialectical Biases in LLMs for Knowledge and Reasoning Benchmarks (Pan et al., Findings 2025)
- PDF: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1139.pdf