Shintaro Sakai


2026

Large language models (LLMs) are increasingly used for mental health applications, raising questions about whether they reflect established clinical knowledge. Clinical psychology has documented systematic cultural differences in the presentation of depression symptoms, with Western populations emphasizing emotional symptoms and many East Asian populations reporting more somatic symptoms. We evaluate whether general-purpose LLMs reproduce these clinically established cross-cultural patterns using prompts grounded in clinical descriptions of depression. We examine model responses under different cultural personas and languages. We find that LLMs struggle to reproduce expected cultural patterns when prompted in English. Prompting in major Eastern languages improves alignment in some configurations, suggesting that language cues partially activate cultural knowledge. However, model behavior remains dominated by a strong, culture-invariant hierarchy of depression symptoms that often overrides cultural cues, highlighting limitations of current LLMs for mental health applications.