The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue

Janosch Haber, Tim Baumgärtner, Ece Takmaz, Lieke Gelderloos, Elia Bruni, Raquel Fernández


Abstract
This paper introduces the PhotoBook dataset, a large-scale collection of visually-grounded, task-oriented dialogues in English designed to investigate shared dialogue history accumulating during conversation. Taking inspiration from seminal work on dialogue analysis, we propose a data-collection task formulated as a collaborative game prompting two online participants to refer to images utilising both their visual context as well as previously established referring expressions. We provide a detailed description of the task setup and a thorough analysis of the 2,500 dialogues collected. To further illustrate the novel features of the dataset, we propose a baseline model for reference resolution which uses a simple method to take into account shared information accumulated in a reference chain. Our results show that this information is particularly important to resolve later descriptions and underline the need to develop more sophisticated models of common ground in dialogue interaction.
Anthology ID:
P19-1184
Original:
P19-1184v1
Version 2:
P19-1184v2
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1895–1910
URL:
https://aclanthology.org/P19-1184
DOI:
10.18653/v1/P19-1184
Cite (ACL):
Janosch Haber, Tim Baumgärtner, Ece Takmaz, Lieke Gelderloos, Elia Bruni, and Raquel Fernández. 2019. The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1895–1910, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue (Haber et al., ACL 2019)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/P19-1184.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-3/P19-1184.mp4
Data
PhotoBook, MS COCO, Visual Question Answering