Translation Is Not Representation: English-Hub Routing in Cross-Lingual Bias Benchmarks

Hak Hyun Kim, Benjamin Huh


Abstract
Cross-lingual bias benchmarks such as JBBQ and KoBBQ translate English bias probes and compare scores across languages, assuming the translated probe measures the same construct. We test this assumption at the representation and behavioral levels using 13B-parameter models matched on architecture but differing in language-training regime. A multi-anchor logit lens shows that an English-centric model (Llama 2) processes Japanese and Korean inputs predominantly through English-script predictions in its middle layers, even where Centered Kernel Alignment (CKA) between languages is high: geometric convergence masks English-hub routing. Matched continual-adaptation comparisons show that target-language adaptation reduces this English-script mass: from 0.77 to 0.56 after Japanese adaptation (Swallow), and from 0.78 to 0.71 after Korean adaptation (koen), while balanced bilingual pretraining (LLM-jp) lowers it further to 0.19. Behaviorally, every model is more stereotype-biased in English than in Japanese, with gaps from 0.13 to 0.14, but this asymmetry is language-specific: in Korean it is weak and disappears after Korean adaptation, with Korean nearly as stereotype-leaning as English. Yet patching English hub states into target-language processing does not transplant this bias. Cross-lingual bias scores thus reflect genuine language-specific behavior, not an English-pivot artifact, even though the underlying representations are not comparable. We distill this dissociation between representation and behavior into a four-step audit protocol for translated bias benchmarks.
Anthology ID: 2026.stereacult-1.11
Volume: Proceedings of the 1st Workshop on Stereotypes Across Cultures in Language Technologies (StereACuLT 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Weicheng Ma, Soroush Vosoughi, Nabeel Gillani, Rolando Coto-Solano
Venues: StereACuLT | WS
Publisher: Association for Computational Linguistics
Pages: 116–125
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.stereacult-1.11/
Cite (ACL): Hak Hyun Kim and Benjamin Huh. 2026. Translation Is Not Representation: English-Hub Routing in Cross-Lingual Bias Benchmarks. In Proceedings of the 1st Workshop on Stereotypes Across Cultures in Language Technologies (StereACuLT 2026), pages 116–125, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Translation Is Not Representation: English-Hub Routing in Cross-Lingual Bias Benchmarks (Kim & Huh, StereACuLT 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.stereacult-1.11.pdf