Through the Looking Glass of Multilingual AI: Contrasting Language- and Name Script-Dependent Ethnic Hierarchies in GPT and DeepSeek

Annabella Sakunkoo; Jonathan Sakunkoo

Abstract

Large language models (LLMs) are increasingly used as evaluative tools across languages, yet bias research remains overwhelmingly Anglocentric, with most studies conducted in English using Latin-script names. It remains unclear whether bias patterns generalize across linguistic contexts. We investigate this question and introduce the stereotype perceptual map, a framework for analyzing how ethnic groups are positioned along evaluative dimensions. Using 900,000 model responses over 45,000 name variations spanning 9 ethnicities, we evaluate model behavior across prompt languages (English, Chinese, Thai), writing scripts (Latin, Chinese, Thai), evaluative domains (competence, warmth), and models (GPT, DeepSeek). We find that ethnic bias hierarchies are jointly shaped by local linguistic context and model origin and differ substantially between Western-centric and Sinocentric models. In math competence judgments, DeepSeek exhibits highly stable rankings across conditions, consistently placing Chinese names at the top, followed by Russian names, with White, Hispanic, and Black names at the bottom. GPT, by contrast, shows strong script-dependent reordering: Latin-script conditions form one stable cluster, while native-script conditions form another, with substantially lower cross-cluster correlations. We term this script-gated bias: transliterating the same names into a non-Latin script can activate a different evaluative frame and produce rankings that are sometimes inversely correlated with Latin-script results. Warmth evaluations are less stable than competence evaluations across both models. Our findings demonstrate that multilingual bias cannot be characterized through single-language, single-script audits. For multilingual users, code-switching between languages can toggle between different bias regimes. Fairness evaluations for multilingual LLMs must therefore account for deployment language, writing system, and model origin to capture the full range of potentially harmful bias these systems exhibit.

Anthology ID: 2026.acl-srw.96
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 1103–1114
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-srw.96/
Cite (ACL): Annabella Sakunkoo and Jonathan Sakunkoo. 2026. Through the Looking Glass of Multilingual AI: Contrasting Language- and Name Script-Dependent Ethnic Hierarchies in GPT and DeepSeek. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 1103–1114, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Through the Looking Glass of Multilingual AI: Contrasting Language- and Name Script-Dependent Ethnic Hierarchies in GPT and DeepSeek (Sakunkoo & Sakunkoo, ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-srw.96.pdf
