SNAG: Spoken Narratives and Gaze Dataset

Preethi Vaidyanathan, Emily T. Prud’hommeaux, Jeff B. Pelz, Cecilia O. Alm

Abstract
Humans rely on multiple sensory modalities when examining and reasoning over images. In this paper, we describe a new multimodal dataset that consists of gaze measurements and spoken descriptions collected in parallel during an image inspection task. The task was performed by multiple participants on 100 general-domain images showing everyday objects and activities. We demonstrate the usefulness of the dataset by applying an existing visual-linguistic data fusion framework to label important image regions with appropriate linguistic labels.
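
As a rough illustration of the kind of visual-linguistic pairing the abstract describes (not the authors' actual fusion framework), the Python sketch below aligns gaze fixations with time-aligned narrative tokens by temporal overlap. The file names, CSV layouts, and column names are hypothetical assumptions for illustration, not the released SNAG format.

# Minimal sketch of pairing gaze fixations with time-aligned spoken-narrative
# tokens by temporal overlap. File names and CSV columns are hypothetical,
# not the released SNAG data format.
import csv
from collections import defaultdict

def load_fixations(path):
    """Read fixations as (start_ms, end_ms, x, y) tuples from a CSV file."""
    fixations = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            fixations.append((float(row["start_ms"]), float(row["end_ms"]),
                              float(row["x"]), float(row["y"])))
    return fixations

def load_tokens(path):
    """Read transcript tokens as (start_ms, end_ms, word) tuples from a CSV file."""
    tokens = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            tokens.append((float(row["start_ms"]), float(row["end_ms"]), row["word"]))
    return tokens

def align(tokens, fixations):
    """Map each spoken word to the fixation points that overlap it in time."""
    aligned = defaultdict(list)
    for t_start, t_end, word in tokens:
        for f_start, f_end, x, y in fixations:
            if f_start < t_end and f_end > t_start:  # intervals overlap in time
                aligned[word].append((x, y))
    return aligned

if __name__ == "__main__":
    # Hypothetical per-image, per-participant input files.
    tokens = load_tokens("narrative_tokens.csv")
    fixations = load_fixations("gaze_fixations.csv")
    for word, points in align(tokens, fixations).items():
        print(word, points)

The framework applied in the paper goes further than this temporal pairing, producing linguistic labels for important image regions; the sketch only shows the basic alignment step one might start from.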
Anthology ID:
P18-2022
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
132–137
URL:
https://aclanthology.org/P18-2022
DOI:
10.18653/v1/P18-2022
Cite (ACL):
Preethi Vaidyanathan, Emily T. Prud’hommeaux, Jeff B. Pelz, and Cecilia O. Alm. 2018. SNAG: Spoken Narratives and Gaze Dataset. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 132–137, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
SNAG: Spoken Narratives and Gaze Dataset (Vaidyanathan et al., ACL 2018)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/P18-2022.pdf
Poster:
P18-2022.Poster.pdf
Data
MS COCO