Overcoming Language Priors with Counterfactual Inference for Visual Question Answering
Ren Zhibo, Wang Huizhen, Zhu Muhua, Wang Yichao, Xiao Tong, Zhu Jingbo
Abstract
Recent years have seen many efforts to attack the issue of language priors in the field of Visual Question Answering (VQA). Among these efforts, causal inference is regarded as a promising direction for mitigating language bias by weakening the direct causal effect of questions on answers. In this paper, we follow the same direction and attack the issue of language priors by incorporating counterfactual data. Moreover, we propose a two-stage training strategy designed to make better use of counterfactual data. Experiments on the widely used benchmark VQA-CP v2 demonstrate the effectiveness of the proposed approach, which improves the baseline by 21.21% and outperforms most previous systems.
- Anthology ID:
- 2023.ccl-1.52
- Volume:
- Proceedings of the 22nd Chinese National Conference on Computational Linguistics
- Month:
- August
- Year:
- 2023
- Address:
- Harbin, China
- Editors:
- Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
- Venue:
- CCL
- Publisher:
- Chinese Information Processing Society of China
- Pages:
- 600–610
- Language:
- English
- URL:
- https://aclanthology.org/2023.ccl-1.52
- Cite (ACL):
- Ren Zhibo, Wang Huizhen, Zhu Muhua, Wang Yichao, Xiao Tong, and Zhu Jingbo. 2023. Overcoming Language Priors with Counterfactual Inference for Visual Question Answering. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 600–610, Harbin, China. Chinese Information Processing Society of China.
- Cite (Informal):
- Overcoming Language Priors with Counterfactual Inference for Visual Question Answering (Zhibo et al., CCL 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2023.ccl-1.52.pdf