Abstract
Wit is a form of rich interaction that is often grounded in a specific situation (e.g., a comment in response to an event). In this work, we attempt to build computational models that can produce witty descriptions for a given image. Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns, in image descriptions. We develop two approaches which involve retrieving witty descriptions for a given image from a large corpus of sentences, or generating them via an encoder-decoder neural network architecture. We compare our approach against meaningful baseline approaches via human studies and show substantial improvements. Moreover, in a Turing test style evaluation, people find the image descriptions generated by our model to be slightly wittier than human-written witty descriptions when the human is subject to similar constraints as the model regarding word usage and style.
- Anthology ID:
- N18-2121
- Volume:
- Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Editors:
- Marilyn Walker, Heng Ji, Amanda Stent
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 770–775
- URL:
- https://aclanthology.org/N18-2121
- DOI:
- 10.18653/v1/N18-2121
- Cite (ACL):
- Arjun Chandrasekaran, Devi Parikh, and Mohit Bansal. 2018. Punny Captions: Witty Wordplay in Image Descriptions. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 770–775, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- Punny Captions: Witty Wordplay in Image Descriptions (Chandrasekaran et al., NAACL 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/N18-2121.pdf