Benjamin Huh

2026

Lost in Translation: Cross-Cultural Bias in LLM-Assisted Medical Symptom Interpretation
Yuting Tian | Salar Khaleghzadegan | Benjamin Huh | Yash Raj | Gena Heng
Proceedings of the 1st Workshop on Stereotypes Across Cultures in Language Technologies (StereACuLT 2026)

Large language models (LLMs) are increasingly used to convert patient language into clinical-style summaries, yet patient symptom descriptions may vary across linguistic, cultural, and cross-linguistic contexts. In this pilot study, we operationalize this variation using four expression styles: direct English, indirect English, culturally mediated English, and Chinese-original patient language. We propose a compact red-teaming framework for testing whether LLM-based symptom interpretation changes when the same underlying concern is expressed in different linguistic and cultural forms. Our pilot dataset contains eight symptom scenarios, each expressed in four styles, yielding 32 vignettes before prompt variation. We evaluate GPT-5 mini as a pilot case-study model under generic and culture-aware prompts, repeating the full evaluation three times to produce 192 model outputs. Reference labels and a stratified subset of model output annotations were reviewed for face validity by an independent reviewer with clinical training. The model usually preserves broad symptom categories, but subtle failure modes emerge. Culture-aware prompting reduces severity downgrades from 14.6% to 9.4% and ambiguity-flagging failures from 28.1% to 13.5%, but does not reduce interpretation inconsistency or clinical category shift, both of which remain at 6.2%. Indirect English shows the highest severity-downgrade and flagging-failure rates, while Chinese-original expressions are often interpreted with the correct broad category but are not consistently flagged as ambiguous. These findings suggest that medical LLM evaluation should assess cultural robustness, severity framing, ambiguity preservation, and human-review escalation in addition to factual accuracy.


Translation Is Not Representation: English-Hub Routing in Cross-Lingual Bias Benchmarks
Hak Hyun Kim | Benjamin Huh
Proceedings of the 1st Workshop on Stereotypes Across Cultures in Language Technologies (StereACuLT 2026)

Cross-lingual bias benchmarks such as JBBQ and KoBBQ translate English bias probes and compare scores across languages, assuming the translated probe measures the same construct. We test this assumption at the representation and behavioral levels using 13B-parameter models matched on architecture but differing in language-training regime. A multi-anchor logit lens shows that an English-centric model (Llama 2) processes Japanese and Korean inputs predominantly through English-script predictions in its middle layers, even where Centered Kernel Alignment (CKA) between languages is high: geometric convergence masks English-hub routing. Matched continual-adaptation comparisons show that target-language adaptation reduces this English-script mass: from 0.77 to 0.56 after Japanese adaptation (Swallow), and from 0.78 to 0.71 after Korean adaptation (koen), while balanced bilingual pretraining (LLM-jp) lowers it further to 0.19. Behaviorally, every model is more stereotype-biased in English than in Japanese, with gaps from 0.13 to 0.14, but this asymmetry is language-specific: in Korean it is weak and disappears after Korean adaptation, with Korean nearly as stereotype-leaning as English. Yet patching English hub states into target-language processing does not transplant this bias. Cross-lingual bias scores thus reflect genuine language-specific behavior, not an English-pivot artifact, even though the underlying representations are not comparable. We distill this dissociation between representation and behavior into a four-step audit protocol for translated bias benchmarks.
