Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Wang, Miguel Eckstein, William Wang
Abstract
The field of text-to-image (T2I) generation has garnered significant attention both within the research community and among everyday users. Despite advances in T2I models, users commonly need to edit their input prompts repeatedly to obtain a satisfactory image, which is time-consuming and labor-intensive. Given the demonstrated text generation power of large-scale language models such as GPT-k, we investigate the potential of using such models to improve the prompt editing process for T2I generation. We conduct a series of experiments to compare the common edits made by humans and GPT-k, evaluate the performance of GPT-k in prompting T2I, and examine factors that may influence this process. We find that GPT-k models focus more on inserting modifiers, while humans tend to replace words and phrases, which includes changes to the subject matter. Experimental results show that GPT-k models are more effective at adjusting modifiers than at predicting spontaneous changes in the primary subject matter. Adopting the edits suggested by GPT-k models may reduce the percentage of remaining edits by 20-30%.
- Anthology ID:
- 2023.emnlp-main.685
- Volume:
- Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11113–11122
- URL:
- https://aclanthology.org/2023.emnlp-main.685
- DOI:
- 10.18653/v1/2023.emnlp-main.685
- Cite (ACL):
- Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Wang, Miguel Eckstein, and William Wang. 2023. Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 11113–11122, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation (Zhu et al., EMNLP 2023)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2023.emnlp-main.685.pdf