Seeing the World through Text: Evaluating Image Descriptions for Commonsense Reasoning in Machine Reading Comprehension

Diana Galván-Sosa; Jun Suzuki; Kyosuke Nishida; Koji Matsuda; Kentaro Inui

Abstract

Despite recent achievements in natural language understanding, reasoning over commonsense knowledge remains a major challenge for AI systems. As the name suggests, common sense is related to perception, and as such, humans derive it from experience rather than from literary education. Recent work in NLP and computer vision has sought to make such knowledge explicit using written language and visual inputs, respectively. Our premise is that the latter source fits better with the characteristics of commonsense acquisition. In this work, we explore to what extent descriptions of real-world scenes are sufficient to learn common sense about different daily situations, drawing upon visual information to answer script knowledge questions.

Anthology ID: 2020.lantern-1.3
Volume: Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN)
Month: December
Year: 2020
Address: Barcelona, Spain
Editors: Aditya Mogadala, Sandro Pezzelle, Dietrich Klakow, Marie-Francine Moens, Zeynep Akata
Venue: LANTERN
Publisher: Association for Computational Linguistics
Pages: 23–29
URL: https://aclanthology.org/2020.lantern-1.3
Cite (ACL): Diana Galvan-Sosa, Jun Suzuki, Kyosuke Nishida, Koji Matsuda, and Kentaro Inui. 2020. Seeing the World through Text: Evaluating Image Descriptions for Commonsense Reasoning in Machine Reading Comprehension. In Proceedings of the Second Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN), pages 23–29, Barcelona, Spain. Association for Computational Linguistics.
Cite (Informal): Seeing the World through Text: Evaluating Image Descriptions for Commonsense Reasoning in Machine Reading Comprehension (Galvan-Sosa et al., LANTERN 2020)
PDF: https://preview.aclanthology.org/landing_page/2020.lantern-1.3.pdf
Data: MCScript, Visual Genome
