XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering

Keonwoo Roh, Yeong-Joon Ju, Seong-Whan Lee


Abstract
Large Language Models (LLMs) have shown significant progress in open-domain question answering (ODQA), yet most evaluations focus on English and assume locale-invariant answers across languages. This assumption neglects the cultural and regional variations that affect both question interpretation and expected answers, leading to biased evaluation in multilingual benchmarks. To address these limitations, we introduce XLQA, a novel benchmark explicitly designed for locale-sensitive multilingual ODQA. XLQA contains 3,000 English seed questions expanded to eight languages, with careful filtering for semantic consistency and human-verified annotations distinguishing locale-invariant and locale-sensitive cases. Our evaluation of five state-of-the-art multilingual LLMs reveals notable failures on locale-sensitive questions, exposing gaps between English and other languages due to a lack of locale-grounded knowledge. We provide a systematic framework and scalable methodology for assessing multilingual QA under diverse cultural contexts, offering a critical resource to advance the real-world applicability of multilingual ODQA systems. Our findings suggest that disparities in training data distribution contribute to differences in both linguistic competence and locale-awareness across models.
Anthology ID:
2025.emnlp-main.1466
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
28797–28809
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1466/
Cite (ACL):
Keonwoo Roh, Yeong-Joon Ju, and Seong-Whan Lee. 2025. XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 28797–28809, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
XLQA: A Benchmark for Locale-Aware Multilingual Open-Domain Question Answering (Roh et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1466.pdf
Checklist:
2025.emnlp-main.1466.checklist.pdf