Curse of bilinguality: Evaluating monolingual and bilingual language models on Chinese linguistic benchmarks

Yuwen Zhou, Yevgen Matusevych


Abstract
We investigate cross-lingual transfer effects in large language models (LLMs) trained on two high-resource languages, English and Chinese. Four monolingual Chinese and four bilingual English–Chinese models are evaluated on two Chinese linguistic benchmarks. The monolingual models consistently outperform the bilingual ones on 12 of the 55 tasks, while the reverse holds for only 4 tasks, highlighting the prevalence of negative (rather than positive) transfer from English to Chinese. Additionally, we carry out a feature attribution analysis of one monolingual and one bilingual model, showing that the differences in their performance may be explained by more predictable attribution patterns in the monolingual model. Our findings have implications for the ongoing effort of training bilingual LLMs.
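Chinese linguistic benchmarks of the kind evaluated here are commonly scored by comparing the probabilities a model assigns to the sentences in a minimal pair. The sketch below illustrates that general technique only; it is not the authors' pipeline, and both the model checkpoint and the example pair are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: any causal Chinese LM works for this sketch; this checkpoint
# is a stand-in, not one of the models evaluated in the paper.
MODEL_NAME = "uer/gpt2-chinese-cluecorpussmall"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Summed token log-probability of `sentence` under the model."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    # `out.loss` is the mean negative log-likelihood over the
    # n_tokens - 1 predicted positions; undo the averaging.
    n_predicted = enc["input_ids"].shape[1] - 1
    return -out.loss.item() * n_predicted

# Hypothetical minimal pair (wrong measure word in the second sentence).
# The model "passes" the item if it prefers the grammatical sentence.
good = "他喝了一杯茶。"  # "He drank a cup of tea."
bad = "他喝了一本茶。"   # ungrammatical classifier choice
print(sentence_log_prob(good) > sentence_log_prob(bad))
```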
Anthology ID:
2025.gem-1.58
Volume:
Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
Month:
July
Year:
2025
Address:
Vienna, Austria and virtual meeting
Editors:
Kaustubh Dhole, Miruna Clinciu
Venues:
GEM | WS
Publisher:
Association for Computational Linguistics
Pages:
622–630
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.gem-1.58/
Cite (ACL):
Yuwen Zhou and Yevgen Matusevych. 2025. Curse of bilinguality: Evaluating monolingual and bilingual language models on Chinese linguistic benchmarks. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 622–630, Vienna, Austria and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
Curse of bilinguality: Evaluating monolingual and bilingual language models on Chinese linguistic benchmarks (Zhou & Matusevych, GEM 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.gem-1.58.pdf