Enhancing Multilingual RAG Systems with Debiased Language Preference-Guided Query Fusion

Jeonghyun Park, Byeongjeong Kim, Seojin Hwang, Hwanhee Lee


Abstract
Multilingual Retrieval-Augmented Generation (mRAG) systems often exhibit a perceived preference for high-resource languages, particularly English, resulting in the widespread adoption of English pivoting. While prior studies attribute this advantage to the superior English-centric capabilities of Large Language Models (LLMs), we find that such measurements are significantly distorted by structural priors inherent in evaluation benchmarks. Specifically, we identify exposure bias and a gold availability prior—both driven by the disproportionate concentration of resources in English—as well as cultural priors rooted in topic locality, as factors that hinder accurate assessment of genuine language preference. To address these biases, we propose DeLP (Debiased Language Preference), a calibrated metric designed to explicitly factor out these structural confounds. Our analysis using DeLP reveals that the previously reported English preference is largely a byproduct of evidence distribution rather than an inherent model bias. Instead, we find that retrievers fundamentally favor monolingual alignment between the query and the document language. Building on this insight, we introduce DELTA (DEbiased Language preference–guided Text Augmentation), a lightweight and efficient mRAG framework that strategically leverages monolingual alignment to optimize cross-lingual retrieval and generation. Experimental results demonstrate that DELTA consistently outperforms English pivoting and mRAG baselines across diverse languages.
Anthology ID:
2026.findings-acl.1353
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
27116–27136
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1353/
Cite (ACL):
Jeonghyun Park, Byeongjeong Kim, Seojin Hwang, and Hwanhee Lee. 2026. Enhancing Multilingual RAG Systems with Debiased Language Preference-Guided Query Fusion. In Findings of the Association for Computational Linguistics: ACL 2026, pages 27116–27136, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Enhancing Multilingual RAG Systems with Debiased Language Preference-Guided Query Fusion (Park et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1353.pdf
Checklist:
 2026.findings-acl.1353.checklist.pdf