"Sorry, Can’t Help You": How Large Language Models Judge Failures to Help Across Languages

Pavithra P M Nair, Gilad Gressel, Krishnashree Achuthan


Abstract
Cross-cultural psychology has shown that moral judgments about failures to help vary systematically across cultures. In a landmark study, Miller, Bersoff, and Harwood (1990) found that while Indian and American participants agreed that failures to help are undesirable, they differed in whether they considered helping a moral obligation subject to social sanction or a personal decision. We adapt Miller et al.’s paradigm (nine scenarios crossing need severity (life-threatening, moderate, minor) with role relationship (parent, friend, stranger), together with their original probe questions) to a cross-lingual LLM setting, presenting them to four LLMs (GPT-5.4, Claude-Opus-4.6, DeepSeek-V3.1, Qwen3-235B) across ten languages. We find that language significantly shapes how LLMs categorize failures to help as moral violations, social conventions, personal-moral concerns, or personal decisions (χ²(27) = 116.14, p < .001, Cramér’s V = 0.147). Models agree across languages that failures to help are undesirable, but diverge substantially in how they classify them, with the primary divergence falling between moral violations and personal decisions. The proportion of responses classifying failures as moral violations decreases as need severity decreases and the role relationship becomes more distant. Cross-lingual variation differs substantially across models, with open-weight models showing significantly stronger variation than closed-weight models. These findings indicate that users consulting LLMs in different languages may receive substantively different moral guidance, underscoring the need for cross-lingual normative auditing as a component of multilingual LLM evaluation.
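For readers unfamiliar with the statistics quoted in the abstract, the following is a minimal sketch of how a chi-square test of independence and Cramér’s V are computed from a table of counts. It is not the authors’ analysis code: the function name `chi_square_and_cramers_v` and the toy counts are hypothetical, chosen only for illustration. The reported 27 degrees of freedom are consistent with a table of ten languages by four response categories, since (10 − 1) × (4 − 1) = 27.

```python
import math


def chi_square_and_cramers_v(table):
    """Return (chi2, dof, cramers_v) for a rows x columns table of counts.

    Hypothetical helper for illustration; rows could be languages and
    columns could be response categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
    # where expected = row_total * col_total / n under independence.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    n_rows, n_cols = len(table), len(table[0])
    dof = (n_rows - 1) * (n_cols - 1)

    # Cramér's V rescales chi-square by sample size and table shape to [0, 1].
    cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, dof, cramers_v


# Toy 2 x 2 example with made-up counts (not data from the paper).
toy_table = [
    [30, 10],
    [10, 30],
]
chi2, dof, v = chi_square_and_cramers_v(toy_table)
print(chi2, dof, v)
```

Because Cramér’s V divides by the sample size and by the smaller table dimension minus one, a highly significant chi-square value can still correspond to a modest effect size, as with the V = 0.147 reported above.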
Anthology ID:
2026.c3nlp-1.13
Volume:
Proceedings of the 4th Workshop on Cross-Cultural Considerations in NLP (C3NLP 2026)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Vinodkumar Prabhakaran, Sunipa Dev, Luciana Benotti, Daniel Hershcovich, Yong Cao, Li Zhou, Bolei Ma, Ife Adebara
Venues:
C3NLP | WS
Publisher:
Association for Computational Linguistics
Pages:
161–176
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.c3nlp-1.13/
Cite (ACL):
Pavithra P M Nair, Gilad Gressel, and Krishnashree Achuthan. 2026. "Sorry, Can’t Help You": How Large Language Models Judge Failures to Help Across Languages. In Proceedings of the 4th Workshop on Cross-Cultural Considerations in NLP (C3NLP 2026), pages 161–176, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
“Sorry, Can’t Help You”: How Large Language Models Judge Failures to Help Across Languages (Nair et al., C3NLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.c3nlp-1.13.pdf