SEEval: Advancing LLM Text Evaluation Efficiency and Accuracy through Self-Explanation Prompting
Meng-Chen Wu, Md Mosharaf Hossain, Tess Wood, Shayan Ali Akbar, Si-Chi Chin, Erwin Cornejo
Abstract
Large language models (LLMs) have achieved remarkable success in various natural language generation (NLG) tasks, but as automatic text evaluators they are not yet ready to replace human judges. In this paper, we propose SEEval (Self-Explanation in Evaluation), a novel prompt-based text evaluator. Inspired by educational psychology, SEEval incorporates self-explanation, a metacognitive strategy, to enhance automatic text evaluation. Our experimental results show that SEEval, without probability normalization, achieves competitive and often superior performance compared to two state-of-the-art baselines, G-Eval and Analyze-Rate, across all evaluation dimensions, and is 20 times more efficient in run-time. The SEEval method is also generalizable, as its results are consistent across three other selected LLMs: Claude 3.5 Sonnet, Command R+, and Mistral-Large 2.
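As a concrete illustration of the setup the abstract describes, below is a minimal, hypothetical sketch of how a self-explanation-style evaluation prompt might be assembled and scored in a single LLM call. This is not the paper's actual prompt or code: the prompt wording, the 1–5 scale, the example `coherence` dimension, and the `call_llm` client are all assumptions made for illustration only.

```python
import re

def build_seeval_style_prompt(source: str, summary: str, dimension: str) -> str:
    """Assemble a self-explanation-style evaluation prompt.

    Hypothetical illustration only: the actual SEEval prompt is
    defined in the paper and is not reproduced here.
    """
    return (
        f"You will evaluate a summary for {dimension}.\n"
        "First, explain in your own words what the source text says and how\n"
        "the summary relates to it (self-explanation). Then assign a score\n"
        "from 1 (worst) to 5 (best).\n\n"
        f"Source:\n{source}\n\n"
        f"Summary:\n{summary}\n\n"
        "Write your self-explanation, then end with 'Score: <number>'."
    )

def parse_score(response: str) -> int | None:
    """Extract the final 'Score: N' integer from one generation.

    A single call with one parsed score, i.e. no probability
    normalization over multiple sampled outputs.
    """
    match = re.search(r"Score:\s*([1-5])", response)
    return int(match.group(1)) if match else None

# Usage sketch with a stand-in LLM client (call_llm is assumed):
# prompt = build_seeval_style_prompt(src_text, sys_summary, "coherence")
# score = parse_score(call_llm(prompt))
```

The single-call, parse-one-score design above mirrors the abstract's "without probability normalization" setting, which is presumably where the reported 20x run-time advantage over probability-normalized scorers such as G-Eval comes from.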
- Anthology ID: 2025.findings-naacl.411
- Volume: Findings of the Association for Computational Linguistics: NAACL 2025
- Month: April
- Year: 2025
- Address: Albuquerque, New Mexico
- Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 7357–7368
- URL: https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.411/
- Cite (ACL): Meng-Chen Wu, Md Mosharaf Hossain, Tess Wood, Shayan Ali Akbar, Si-Chi Chin, and Erwin Cornejo. 2025. SEEval: Advancing LLM Text Evaluation Efficiency and Accuracy through Self-Explanation Prompting. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 7357–7368, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal): SEEval: Advancing LLM Text Evaluation Efficiency and Accuracy through Self-Explanation Prompting (Wu et al., Findings 2025)
- PDF: https://preview.aclanthology.org/fix-sig-urls/2025.findings-naacl.411.pdf