Worldwide LiveVQA: Real-Time Visual Knowledge Seeking and Updating Across Languages
Xuanao Huang, Xingjia Liu, Zetong Zhou, Yuyang Peng, Yao Wan, Dongping Chen
Abstract
Knowledge about the visual world is not only constantly evolving but also inherently happening all over the world: breaking news in Tokyo, political events in São Paulo, and cultural phenomena in Cairo are first reported in Japanese, Portuguese, and Arabic, carrying regional context that English-centric resources cannot fully capture. Yet existing resources for visual knowledge remain confined to English, creating a "Worldwide Knowledge Gap" that hinders developing truly global assistants. To quantify this gap, we introduce LiveVQA-W(orldwide), the first dynamic-updating dataset for real-time, multilingual visual knowledge seeking and updating across ten major languages. Drawing from worldwide news outlets, YouTube videos, and academic platforms during August–December 2025, LiveVQA-W comprises 234K images, 873K questions, and 171K visual entities with hierarchical evaluation: Level 1 for visual entity recognition and Level 2 for multi-hop cross-lingual reasoning. Our comprehensive benchmarking of 15 state-of-the-art MLLMs reveals that models without search achieve near-random performance, while search-augmented models exhibit severe linguistic bias, with English accuracy nearly double that of other languages. Furthermore, we explore visual knowledge updating through large-scale training, finding that injected knowledge improves recall but remains fragile under prompt rephrasing and image perturbations such as rotation and flipping. We release the fully replicable data collection pipeline and raw dataset to support continuous community-driven expansion. The benchmark, code, and related resources are available at: https://worldwide-livevqa.github.io.
- Anthology ID:
- 2026.findings-acl.1984
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 39819–39894
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1984/
- Cite (ACL):
- Xuanao Huang, Xingjia Liu, Zetong Zhou, Yuyang Peng, Yao Wan, and Dongping Chen. 2026. Worldwide LiveVQA: Real-Time Visual Knowledge Seeking and Updating Across Languages. In Findings of the Association for Computational Linguistics: ACL 2026, pages 39819–39894, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Worldwide LiveVQA: Real-Time Visual Knowledge Seeking and Updating Across Languages (Huang et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1984.pdf