Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity

So Fukuda, Hayato Ogawa, Kaito Horio, Daisuke Kawahara, Tomohide Shibata


Abstract
To evaluate the creativity of large language models (LLMs) in Japanese, we construct three benchmarks: Japanese Creativity Questions (JCQ), Divergent Association Task (DAT), and Story Alteration Task (SAT). JCQ comprehensively evaluates creativity using LLMs. Meanwhile, DAT and SAT measure specific aspects of creative ability using embeddings. We also analyze correlations between JCQ and DAT, JCQ and SAT, and DAT and SAT. While JCQ provides comprehensive evaluation, it is relatively time- and resource-intensive. In contrast, DAT and SAT offer lower comprehensiveness but enable quick, low-cost assessment. Additionally, we investigate whether training with DAT contributes to enhancing LLM creativity.
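The abstract says only that DAT is scored "using embeddings" and does not give the formula. As a rough illustration, the sketch below assumes the scoring used in the original English Divergent Association Task: the mean pairwise cosine distance between the embeddings of the submitted words, scaled by 100. The function names and the toy two-dimensional vectors are hypothetical, and the Japanese benchmark described here may use a different embedding model or scoring rule.

```python
import itertools
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dat_style_score(embeddings):
    """Mean pairwise cosine distance over all word embeddings, times 100.

    Assumption: this mirrors the original DAT scoring; it is not
    confirmed to be the scoring used in the paper.
    """
    pairs = list(itertools.combinations(embeddings, 2))
    if not pairs:
        raise ValueError("need at least two embeddings")
    return 100.0 * sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy vectors standing in for real word embeddings (hypothetical data).
unrelated_words = [[1.0, 0.0], [0.0, 1.0]]   # orthogonal directions
related_words = [[1.0, 0.0], [2.0, 0.0]]     # same direction
print(dat_style_score(unrelated_words))
print(dat_style_score(related_words))
```

Under this assumed scoring, word lists whose embeddings point in more dissimilar directions receive higher scores, which is why such a measure can be computed quickly and cheaply compared with LLM-based judging.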
Anthology ID: 2025.acl-srw.69
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Jin Zhao, Mingyang Wang, Zhu Liu
Venues: ACL | WS
Publisher: Association for Computational Linguistics
Pages: 939–957
URL: https://preview.aclanthology.org/landing_page/2025.acl-srw.69/
Cite (ACL): So Fukuda, Hayato Ogawa, Kaito Horio, Daisuke Kawahara, and Tomohide Shibata. 2025. Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 939–957, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Building Japanese Creativity Benchmarks and Applying them to Enhance LLM Creativity (Fukuda et al., ACL 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.acl-srw.69.pdf