Xiao Zinan

2026

MonCulture-Eval: A Hierarchical Benchmark for Evaluating Mongolian Cultural Capabilities of Large Language Models across Scripts and Regions
Quulgan Minggad | Xiao Zinan | Yuan Sun
Findings of the Association for Computational Linguistics: ACL 2026

While Large Language Models (LLMs) have achieved impressive linguistic fluency in low-resource languages, their capacity to process deep cultural nuances remains insufficiently quantified. This paper introduces MonCulture-Eval, a benchmark designed to assess the cultural intelligence of LLMs in the Mongolian context across two writing systems (Traditional and Cyrillic) and three regional sub-cultures (Alxa, Ordos, and Horqin). Curated entirely from primary, non-digitized archives to prevent data contamination, the benchmark employs a three-layer cognitive hierarchy—Factual, Situational, and Values—supplemented by specialized tasks including Riddles, Taboos, and Proverbs. Evaluation of frontier models reveals a severe "Script Gap" and a systematic "Etic Bias," where models sanitize spiritual rituals into secular functional norms.

Co-authors

Quulgan Minggad 1
Yuan Sun 1

Venues

Findings 1
