Investigating Language Preference of Multilingual RAG Systems

Jeonghyun Park, Hwanhee Lee


Abstract
Multilingual Retrieval-Augmented Generation (mRAG) systems enhance language models by integrating external multilingual information to produce context-aware responses. However, mRAG systems struggle to retrieve relevant information because of linguistic variation between queries and documents, and they generate inconsistent responses when multilingual sources conflict. In this work, we systematically investigate language preferences in both the retrieval and generation stages of mRAG through a series of experiments. Our analysis indicates that retrievers tend to prefer high-resource languages and the query language, yet this preference does not consistently improve generation performance. Moreover, we observe that generators prefer the query language or Latin scripts, leading to inconsistent outputs. To overcome these issues, we propose Dual Knowledge Multilingual RAG (DKM-RAG), a simple yet effective framework that fuses translated multilingual passages with complementary model knowledge. Empirical results demonstrate that DKM-RAG mitigates language preference in generation and enhances performance across diverse linguistic settings. Code is available at https://github.com/jeonghyunpark2002/LanguagePreference.git
Anthology ID:
2025.findings-acl.295
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5647–5675
URL:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.295/
Cite (ACL):
Jeonghyun Park and Hwanhee Lee. 2025. Investigating Language Preference of Multilingual RAG Systems. In Findings of the Association for Computational Linguistics: ACL 2025, pages 5647–5675, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Investigating Language Preference of Multilingual RAG Systems (Park & Lee, Findings 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.295.pdf