Arvind Narayanan
2026
Localized Cultural Knowledge is Conserved and Controllable in Large Language Models
Veniamin Veselovsky | Berke Argın | Benedikt Stroebl | Chris Wendler | Robert West | James Evans | Thomas L. Griffiths | Arvind Narayanan
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs), like human language learners, show patterns influenced by their dominant training language. Just as humans learning a new language display patterns shaped by their native tongue (semantic accents), LLMs often default to English-centric responses even when generating in other languages. However, we observe that explicitly providing cultural context in prompts significantly improves the models’ ability to generate culturally localized responses. We term this phenomenon the explicit-implicit localization gap: cultural knowledge exists within LLMs, but it may not surface naturally in multilingual interaction unless cultural context is explicitly included. In this paper, we (1) quantify this gap in multiple LLMs using a new cultural localization benchmark and find large (>10%) gaps in the majority of investigated models; (2) demonstrate a fundamental trade-off between localization accuracy and output diversity; and (3) use mechanistic interpretability to identify the underlying localization mechanisms within LLMs, showing that these mechanisms are both language- and task-agnostic, with individual steering vectors generalizing effectively across different languages and culturally relevant tasks.
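As a rough illustration of how such a gap could be quantified, the sketch below assumes the gap is the difference in localization accuracy between prompts that state the cultural context explicitly and prompts that leave it implicit. That definition, the function names, and the toy judgments are assumptions made for illustration; the abstract does not specify the benchmark's actual scoring.

```python
from typing import Sequence


def localization_accuracy(is_localized: Sequence[bool]) -> float:
    """Fraction of responses judged culturally localized (hypothetical helper)."""
    if not is_localized:
        raise ValueError("need at least one judged response")
    return sum(is_localized) / len(is_localized)


def explicit_implicit_gap(explicit: Sequence[bool], implicit: Sequence[bool]) -> float:
    """Assumed definition: accuracy with explicit context minus accuracy without it."""
    return localization_accuracy(explicit) - localization_accuracy(implicit)


# Made-up judgments for illustration only; these are not results from the paper.
explicit_judgments = [True, True, True, False]    # cultural context stated in the prompt
implicit_judgments = [True, False, False, False]  # only the prompt language hints at context

gap = explicit_implicit_gap(explicit_judgments, implicit_judgments)
print(f"gap = {gap:.2f}")
```

Under this assumed definition, a positive gap means the model localizes better when told the cultural context than when it must infer it from the prompt language alone.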