@inproceedings{noble-ilinykh-2023-describe,
    title     = {Describe Me an {Auklet}: Generating Grounded Perceptual Category Descriptions},
    author    = {Noble, Bill and
                 Ilinykh, Nikolai},
    editor    = {Bouamor, Houda and
                 Pino, Juan and
                 Bali, Kalika},
    booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
    month     = dec,
    year      = {2023},
    address   = {Singapore},
    publisher = {Association for Computational Linguistics},
    url       = {https://aclanthology.org/2023.emnlp-main.580/},
    doi       = {10.18653/v1/2023.emnlp-main.580},
    pages     = {9330--9347},
    abstract  = {Human speakers can generate descriptions of perceptual concepts, abstracted from the instance-level. Moreover, such descriptions can be used by other speakers to learn provisional representations of those concepts. Learning and using abstract perceptual concepts is under-investigated in the language-and-vision field. The problem is also highly relevant to the field of representation learning in multi-modal NLP. In this paper, we introduce a framework for testing category-level perceptual grounding in multi-modal language models. In particular, we train separate neural networks to generate and interpret descriptions of visual categories. We measure the communicative success of the two models with the zero-shot classification performance of the interpretation model, which we argue is an indicator of perceptual grounding. Using this framework, we compare the performance of prototype- and exemplar-based representations. Finally, we show that communicative success exposes performance issues in the generation model, not captured by traditional intrinsic NLG evaluation metrics, and argue that these issues stem from a failure to properly ground language in vision at the category level.},
}
Markdown (Informal)
[Describe Me an Auklet: Generating Grounded Perceptual Category Descriptions](https://aclanthology.org/2023.emnlp-main.580/) (Noble & Ilinykh, EMNLP 2023)
ACL