Weilong Dai
2025
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang | Wanggui He | Quanyu Long | Yandi Wang | Haoyuan Li | Zhelun Yu | Fangxun Shu | Weilong Dai | Hao Jiang | Fei Wu | Leilei Gan
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Most existing studies on evaluating text-to-image (T2I) models primarily focus on text-image alignment, image quality, and object composition capabilities, with comparatively few addressing the factuality of the synthesized images, particularly when the images involve knowledge-intensive concepts. In this work, we present T2I-FactualBench—the largest benchmark to date in terms of the number of concepts and prompts specifically designed to evaluate the factuality of knowledge-intensive concept generation. T2I-FactualBench consists of a three-tiered knowledge-intensive text-to-image generation framework, ranging from the basic memorization of individual knowledge concepts to the more complex composition of multiple knowledge concepts. We further introduce a multi-round visual question answering (VQA)-based evaluation framework to assess the factuality of the three-tiered knowledge-intensive text-to-image generation tasks. Experiments on T2I-FactualBench indicate that current state-of-the-art (SOTA) T2I models still leave significant room for improvement. We release our datasets and code at https://github.com/Safeoffellow/T2I-FactualBench.