Deep learning evaluation using deep linguistic processing

Alexander Kuhnle, Ann Copestake


Abstract
We discuss problems with the standard approaches to evaluation for tasks like visual question answering, and argue that artificial data can be used to address these as a complement to current practice. We demonstrate that with the help of existing ‘deep’ linguistic processing technology we are able to create challenging abstract datasets, which enable us to investigate the language understanding abilities of multimodal deep learning models in detail, as compared to a single performance value on a static and monolithic dataset.
Anthology ID:
W18-1003
Volume:
Proceedings of the Workshop on Generalization in the Age of Deep Learning
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Yonatan Bisk, Omer Levy, Mark Yatskar
Venue:
Gen-Deep
Publisher:
Association for Computational Linguistics
Pages:
17–23
URL:
https://aclanthology.org/W18-1003
DOI:
10.18653/v1/W18-1003
Cite (ACL):
Alexander Kuhnle and Ann Copestake. 2018. Deep learning evaluation using deep linguistic processing. In Proceedings of the Workshop on Generalization in the Age of Deep Learning, pages 17–23, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Deep learning evaluation using deep linguistic processing (Kuhnle & Copestake, Gen-Deep 2018)
PDF:
https://preview.aclanthology.org/proper-vol2-ingestion/W18-1003.pdf
Data
CLEVR, MS COCO, NLVR, SHAPES, ShapeWorld, Visual Question Answering, Visual Question Answering v2.0