Large Language Models can Share Images, Too!

Young-Jun Lee; Dokyong Lee; Joo Won Sung; Jonghwan Hyeon; Ho-Jin Choi

Abstract

This paper explores the image-sharing capability of Large Language Models (LLMs), such as GPT-4 and LLaMA 2, in a zero-shot setting. To facilitate a comprehensive evaluation of LLMs, we introduce the PhotoChat++ dataset, which includes enriched annotations (i.e., intent, triggering sentence, image description, and salient information). Furthermore, we present the gradient-free and extensible Decide, Describe, and Retrieve (DribeR) framework. With extensive experiments, we unlock the image-sharing capability of DribeR equipped with LLMs in zero-shot prompting, with ChatGPT achieving the best performance. Our findings also reveal the emergent image-sharing ability in LLMs under zero-shot conditions, validating the effectiveness of DribeR. We use this framework to demonstrate its practicality and effectiveness in two real-world scenarios: (1) human-bot interaction and (2) dataset augmentation. To the best of our knowledge, this is the first study to assess the image-sharing ability of various LLMs in a zero-shot setting. We make our source code and dataset publicly available at https://github.com/passing2961/DribeR.

Anthology ID: 2024.findings-acl.39
Volume: Findings of the Association for Computational Linguistics ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand and virtual meeting
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 692–713
URL: https://aclanthology.org/2024.findings-acl.39
Cite (ACL): Young-Jun Lee, Dokyong Lee, Joo Won Sung, Jonghwan Hyeon, and Ho-Jin Choi. 2024. Large Language Models can Share Images, Too! In Findings of the Association for Computational Linguistics ACL 2024, pages 692–713, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal): Large Language Models can Share Images, Too! (Lee et al., Findings 2024)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.39.pdf
