Diversity and Consistency: Exploring Visual Question-Answer Pair Generation
Sen Yang, Qingyu Zhou, Dawei Feng, Yang Liu, Chao Li, Yunbo Cao, Dongsheng Li
Abstract
Although it holds promise for downstream applications, generating questions and answers together is under-explored. In this paper, we introduce a novel task: generating question-answer pairs from images. The task requires generating diverse question-answer pairs while keeping each pair consistent. We study different generation paradigms for this task and propose three models: the pipeline model, the joint model, and the sequential model. We integrate variational inference into these models to achieve diversity and consistency. We also propose region representation scaling and attention alignment to further improve consistency. Finally, we devise an evaluator that serves as a quantitative metric for consistency. We validate our approach on two benchmarks, VQA2.0 and Visual-7w, by evaluating diversity and consistency both automatically and manually. Experimental results show the effectiveness of our models: they can generate diverse or consistent pairs. Moreover, this task can be used to improve visual question generation and visual question answering.
- Anthology ID:
- 2021.findings-emnlp.91
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2021
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Venue:
- Findings
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1053–1066
- URL:
- https://aclanthology.org/2021.findings-emnlp.91
- DOI:
- 10.18653/v1/2021.findings-emnlp.91
- Cite (ACL):
- Sen Yang, Qingyu Zhou, Dawei Feng, Yang Liu, Chao Li, Yunbo Cao, and Dongsheng Li. 2021. Diversity and Consistency: Exploring Visual Question-Answer Pair Generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 1053–1066, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Diversity and Consistency: Exploring Visual Question-Answer Pair Generation (Yang et al., Findings 2021)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2021.findings-emnlp.91.pdf
- Data
- VQG, Visual Question Answering v2.0, Visual7W