SNAG: Spoken Narratives and Gaze Dataset
Preethi Vaidyanathan, Emily T. Prud’hommeaux, Jeff B. Pelz, Cecilia O. Alm
Abstract
Humans rely on multiple sensory modalities when examining and reasoning over images. In this paper, we describe a new multimodal dataset that consists of gaze measurements and spoken descriptions collected in parallel during an image inspection task. The task was performed by multiple participants on 100 general-domain images showing everyday objects and activities. We demonstrate the usefulness of the dataset by applying an existing visual-linguistic data fusion framework in order to label important image regions with appropriate linguistic labels.
- Anthology ID:
- P18-2022
- Volume:
- Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Editors:
- Iryna Gurevych, Yusuke Miyao
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 132–137
- URL:
- https://preview.aclanthology.org/ingest_wac_2008/P18-2022/
- DOI:
- 10.18653/v1/P18-2022
- Cite (ACL):
- Preethi Vaidyanathan, Emily T. Prud’hommeaux, Jeff B. Pelz, and Cecilia O. Alm. 2018. SNAG: Spoken Narratives and Gaze Dataset. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 132–137, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- SNAG: Spoken Narratives and Gaze Dataset (Vaidyanathan et al., ACL 2018)
- PDF:
- https://preview.aclanthology.org/ingest_wac_2008/P18-2022.pdf
- Data
- MS COCO