Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach

Ruizhe Li, Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao


Abstract
Creative writing is a key capability of Large Language Models (LLMs), with potential applications in literature, storytelling, and various creative domains. However, evaluating the creativity of machine-generated texts remains a significant challenge, as existing methods either rely on costly manual annotations or fail to align closely with human assessments. In this paper, we propose an effective automated evaluation method based on the Torrance Test of Creative Writing (TTCW), which evaluates creativity as a product. Our method employs a reference-based Likert-style approach, scoring generated creative texts relative to high-quality reference texts across various tests. Experimental results demonstrate that our method significantly improves the alignment between LLM evaluations and human assessments, achieving a pairwise accuracy of 0.75 (+15%).
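The abstract reports alignment with human assessments as a pairwise accuracy. As a rough illustration only, the sketch below computes one common form of that metric: over all pairs of texts that humans scored differently, the fraction where the automated scores rank the pair the same way. The function name `pairwise_accuracy`, the handling of ties, and the toy scores are assumptions made for this example and are not taken from the paper, which may define the metric differently.

```python
from itertools import combinations


def pairwise_accuracy(human_scores, model_scores):
    """Fraction of human-ordered pairs that the model orders the same way.

    Assumptions (not from the paper): pairs tied in the human scores are
    skipped, and a tie in the model scores counts as a disagreement.
    """
    if len(human_scores) != len(model_scores):
        raise ValueError("score lists must have the same length")

    agree = 0
    total = 0
    for i, j in combinations(range(len(human_scores)), 2):
        human_diff = human_scores[i] - human_scores[j]
        if human_diff == 0:
            continue  # no human preference for this pair
        total += 1
        model_diff = model_scores[i] - model_scores[j]
        if human_diff * model_diff > 0:
            agree += 1  # same sign means same ordering
    return agree / total if total else float("nan")


# Toy data: three texts with hypothetical human and automated scores.
human = [3, 1, 2]
model = [5, 2, 1]
print(pairwise_accuracy(human, model))
```

In this toy case the automated scores agree with the human ordering on two of the three pairs, so the function returns 2/3.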
Anthology ID:
2025.findings-emnlp.1171
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
21475–21488
URL:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1171/
DOI:
10.18653/v1/2025.findings-emnlp.1171
Cite (ACL):
Ruizhe Li, Chiwei Zhu, Benfeng Xu, Xiaorui Wang, and Zhendong Mao. 2025. Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 21475–21488, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Automated Creativity Evaluation for Large Language Models: A Reference-Based Approach (Li et al., Findings 2025)
PDF:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1171.pdf
Checklist:
 2025.findings-emnlp.1171.checklist.pdf