Yang Liu

Other people with similar names: Yang Janet Liu (Georgetown University; 刘洋), Yang Liu (3M Health Information Systems), Yang Liu (Beijing Language and Culture University), Yang Liu (National University of Defense Technology), Yang Liu (Edinburgh Ph.D., Microsoft), Yang Liu (University of Helsinki), Yang Liu (The Chinese University of Hong Kong (Shenzhen)), Yang Liu (刘扬; Ph.D. Purdue; ICSI, Dallas, Facebook, Liulishuo, Amazon), Yang Liu (刘洋; ICT, Tsinghua, Beijing Academy of Artificial Intelligence), Yang Liu (Microsoft Cognitive Services Research), Yang Liu (刘扬; Peking University), Yang Liu (Samsung Research Center Beijing), Yang Liu (Tianjin University, China), Yang Liu (Univ. of Michigan, UC Santa Cruz), Yang Liu (Wilfrid Laurier University), and several other authors listed without affiliations



2025

Knowledge Image Matters: Improving Knowledge-Based Visual Reasoning with Multi-Image Large Language Models
Guanghui Ye | Huan Zhao | Zhixue Zhao | Xupeng Zha | Yang Liu | Zhihua Jiang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We revisit knowledge-based visual reasoning (KB-VR) in light of modern advances in multimodal large language models (MLLMs), and make the following contributions: (i) We propose the Visual Knowledge Card (VKC), a synthesized image that incorporates not only internal visual knowledge (e.g., scene-aware information) detected from the raw image, but also external world knowledge (e.g., attribute or object knowledge) produced by a knowledge generator; (ii) We present VKC-based Multi-Image Reasoning (VKC-MIR), a four-stage pipeline that uses a state-of-the-art scene perception engine to construct an initial VKC (Stage-1), an LLM to generate relevant domain knowledge (Stage-2), an image-editing toolkit to incorporate the generated knowledge into an updated VKC (Stage-3), and a multi-image MLLM to solve the VKC-enhanced task (Stage-4). In experiments on three popular KB-VR benchmarks, our approach achieves new state-of-the-art results, surpassing previous top-performing models.
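
The abstract outlines the control flow of the four-stage pipeline but not its implementation. The following minimal Python sketch illustrates one plausible way the stages could be wired together; every class, function, and parameter name here is an assumption for illustration, not the authors' code or any specific library API.

```python
# Hypothetical sketch of the four-stage VKC-MIR control flow described in the abstract.
# All names (VKCMIRPipeline, perceive_scene, etc.) are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VKCMIRPipeline:
    # Stage-1: a scene perception engine extracts internal visual knowledge from the raw image
    perceive_scene: Callable[[bytes], str]
    # Stage-2: an LLM generates relevant external/domain knowledge for the question
    generate_knowledge: Callable[[str, str], List[str]]
    # Stage-3: an image-editing step renders the generated knowledge onto an updated VKC image
    render_vkc: Callable[[str, List[str]], bytes]
    # Stage-4: a multi-image MLLM reasons jointly over the raw image and the VKC
    answer_with_mllm: Callable[[List[bytes], str], str]

    def run(self, raw_image: bytes, question: str) -> str:
        scene_facts = self.perceive_scene(raw_image)                # Stage-1
        knowledge = self.generate_knowledge(scene_facts, question)  # Stage-2
        vkc_image = self.render_vkc(scene_facts, knowledge)         # Stage-3
        return self.answer_with_mllm([raw_image, vkc_image], question)  # Stage-4


# Toy usage with placeholder components; a real system would plug in an actual
# detector, LLM, image editor, and multi-image MLLM at each stage.
if __name__ == "__main__":
    pipeline = VKCMIRPipeline(
        perceive_scene=lambda img: "objects: dog, frisbee; scene: park",
        generate_knowledge=lambda facts, q: ["frisbees are typically thrown by a person"],
        render_vkc=lambda facts, kn: f"VKC[{facts} | {'; '.join(kn)}]".encode(),
        answer_with_mllm=lambda imgs, q: "the person in the park",
    )
    print(pipeline.run(b"<raw image bytes>", "Who threw the frisbee?"))
```

The key design point the sketch highlights is that the VKC is itself an image, so the final stage consumes two images (the raw photo and the knowledge card) rather than an image plus a text prompt.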