Know What You Don’t Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories

Sina Zarrieß, David Schlangen


Abstract
Zero-shot learning in Language & Vision is the task of correctly labelling (or naming) objects of novel categories. Another strand of work in L&V aims at pragmatically informative rather than “correct” object descriptions, e.g. in reference games. We combine these lines of research and model zero-shot reference games, where a speaker needs to successfully refer to a novel object in an image. Inspired by models of “rational speech acts”, we extend a neural generator to become a pragmatic speaker reasoning about uncertain object categories. As a result of this reasoning, the generator produces fewer nouns and names of distractor categories as compared to a literal speaker. We show that this conversational strategy for dealing with novel objects often improves communicative success, in terms of resolution accuracy of an automatic listener.
Anthology ID:
P19-1063
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
654–659
URL:
https://aclanthology.org/P19-1063
DOI:
10.18653/v1/P19-1063
Cite (ACL):
Sina Zarrieß and David Schlangen. 2019. Know What You Don’t Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 654–659, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Know What You Don’t Know: Modeling a Pragmatic Speaker that Refers to Objects of Unknown Categories (Zarrieß & Schlangen, ACL 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/P19-1063.pdf
Data
COCO, RefCOCO