Disentangling Language and Culture for Evaluating Multilingual Large Language Models
Jiahao Ying, Wei Tang, Yiran Zhao, Yixin Cao, Yu Rong, Wenxuan Zhang
Abstract
This paper introduces a Dual Evaluation Framework to comprehensively assess the multilingual capabilities of LLMs. By decomposing the evaluation along the dimensions of linguistic medium and cultural context, this framework enables a nuanced analysis of LLMs’ ability to process questions within both native and cross-cultural contexts cross-lingually. Extensive evaluations are conducted on a wide range of models, revealing a notable “Cultural-Linguistic Synergy” phenomenon, where models exhibit better performance when questions are culturally aligned with the language. This phenomenon is further explored through interpretability probing, which shows that a higher proportion of specific neurons are activated in a language’s cultural context. This activation proportion could serve as a potential indicator for evaluating multilingual performance during model training. Our findings challenge the prevailing notion that LLMs, primarily trained on English data, perform uniformly across languages, and highlight the necessity of culturally and linguistically aware model evaluations.
- Anthology ID:
- 2025.acl-long.1082
- Volume:
- Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 22230–22251
- URL:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1082/
- Cite (ACL):
- Jiahao Ying, Wei Tang, Yiran Zhao, Yixin Cao, Yu Rong, and Wenxuan Zhang. 2025. Disentangling Language and Culture for Evaluating Multilingual Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 22230–22251, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Disentangling Language and Culture for Evaluating Multilingual Large Language Models (Ying et al., ACL 2025)
- PDF:
- https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1082.pdf