QACE: Asking Questions to Evaluate an Image Caption
Hwanhee Lee, Thomas Scialom, Seunghyun Yoon, Franck Dernoncourt, Kyomin Jung
Abstract
In this paper, we propose QACE, a new metric based on Question Answering for Caption Evaluation, which evaluates image captions using Question Generation (QG) and Question Answering (QA) systems. QACE generates questions on the evaluated caption and checks its content by asking the questions on either the reference caption or the source image. We first develop QACE_Ref, which compares the answers of the evaluated caption to those of its reference, and report competitive results with the state-of-the-art metrics. To go further, we propose QACE_Img, which asks the questions directly on the image instead of the reference. A Visual-QA system is necessary for QACE_Img. Unfortunately, the standard VQA models are framed as classification among only a few thousand categories. Instead, we propose Visual-T5, an abstractive VQA system. The resulting metric, QACE_Img, is multi-modal, reference-less, and explainable. Our experiments show that QACE_Img compares favorably w.r.t. other reference-less metrics.
- Anthology ID:
- 2021.findings-emnlp.395
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 4631–4638
- URL:
- https://aclanthology.org/2021.findings-emnlp.395
- DOI:
- 10.18653/v1/2021.findings-emnlp.395
- Cite (ACL):
- Hwanhee Lee, Thomas Scialom, Seunghyun Yoon, Franck Dernoncourt, and Kyomin Jung. 2021. QACE: Asking Questions to Evaluate an Image Caption. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 4631–4638, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- QACE: Asking Questions to Evaluate an Image Caption (Lee et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2021.findings-emnlp.395.pdf
- Code
- hwanheelee1993/qace
- Data
- SQuAD, Visual Question Answering
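
The question-based scoring loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `qace_score`, `token_f1`, `toy_generate_questions`, and `toy_answer` are hypothetical names, the token-level F1 used to compare answers is an assumption (the abstract does not say how answers are compared), and the toy QG/QA functions are positional stand-ins for real models. For a QACE_Ref-style score the context would be the reference caption; for a QACE_Img-style score it would be the image, with a VQA system supplied as `answer`.

```python
from collections import Counter
from typing import Any, Callable, List


def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between two answers (assumed similarity function)."""
    p, g = pred.lower().split(), gold.lower().split()
    if not p or not g:
        # Two empty answers match; one empty answer does not.
        return float(p == g)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def qace_score(
    candidate: str,
    context: Any,
    generate_questions: Callable[[str], List[str]],
    answer: Callable[[str, Any], str],
    similarity: Callable[[str, str], float] = token_f1,
) -> float:
    """Average agreement between answers from the candidate and the context.

    Questions are generated on the candidate caption, answered once on the
    candidate and once on the context (reference caption or image), and the
    two answers are compared.
    """
    questions = generate_questions(candidate)
    if not questions:
        return 0.0
    scores = [
        similarity(answer(q, candidate), answer(q, context)) for q in questions
    ]
    return sum(scores) / len(scores)


# --- Toy stand-ins (hypothetical; a real system would use QG/QA models) ---

def toy_generate_questions(caption: str) -> List[str]:
    """One cloze-style question per word: blank out the word at position i."""
    words = caption.split()
    return [" ".join(words[:i] + ["___"] + words[i + 1:]) for i in range(len(words))]


def toy_answer(question: str, context: str) -> str:
    """Return the context word at the blank's position, or '' if out of range."""
    idx = question.split().index("___")
    ctx = context.split()
    return ctx[idx] if idx < len(ctx) else ""
```

With these toy components, a candidate scored against an identical reference agrees on every question, while a candidate that differs from the reference in one of five words disagrees on exactly one question. Because each question is tied to a specific part of the caption, the per-question scores also indicate where the candidate diverges, which is the sense in which such a metric can be called explainable.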