GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions
Ting-Yao Hsu, Chieh-Yang Huang, Ryan Rossi, Sungchul Kim, C. Giles, Ting-Hao Huang
Abstract
There is growing interest in systems that generate captions for scientific figures. However, assessing these systems’ output poses a significant challenge. Human evaluation requires academic expertise and is costly, while automatic evaluation depends on often low-quality author-written captions. This paper investigates using large language models (LLMs) as a cost-effective, reference-free method for evaluating figure captions. We first constructed SCICAP-EVAL, a human evaluation dataset that contains human judgments for 3,600 scientific figure captions, both original and machine-made, for 600 arXiv figures. We then prompted LLMs like GPT-4 and GPT-3 to score (1-6) each caption based on its potential to aid reader understanding, given relevant context such as figure-mentioning paragraphs. Results show that GPT-4, used as a zero-shot evaluator, outperformed all other models and even surpassed assessments made by computer science undergraduates, achieving a Kendall correlation score of 0.401 with Ph.D. students’ rankings.- Anthology ID:
- 2023.findings-emnlp.363
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 5464–5474
- Language:
- URL:
- https://aclanthology.org/2023.findings-emnlp.363
- DOI:
- 10.18653/v1/2023.findings-emnlp.363
- Cite (ACL):
- Ting-Yao Hsu, Chieh-Yang Huang, Ryan Rossi, Sungchul Kim, C. Giles, and Ting-Hao Huang. 2023. GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 5464–5474, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions (Hsu et al., Findings 2023)
- PDF:
- https://preview.aclanthology.org/corrections-2024-07/2023.findings-emnlp.363.pdf