How (un)faithful are explainable LLM-based NLG metrics?

Alex Terentowicz, Mateusz Lango, Ondrej Dusek


Abstract
Explainable NLG metrics are becoming a popular research topic; however, the faithfulness of the explanations they provide is typically not evaluated. In this work, we propose a testbed for assessing the faithfulness of span-based metrics by performing controlled perturbations of their explanations and observing changes in the final score. We show that several popular LLM evaluators do not consistently produce faithful explanations.
Anthology ID:
2025.inlg-main.37
Volume:
Proceedings of the 18th International Natural Language Generation Conference
Month:
October
Year:
2025
Address:
Hanoi, Vietnam
Editors:
Lucie Flek, Shashi Narayan, Lê Hồng Phương, Jiahuan Pei
Venue:
INLG
SIG:
SIGGEN
Publisher:
Association for Computational Linguistics
Pages:
617–658
URL:
https://preview.aclanthology.org/author-page-you-zhang-rochester/2025.inlg-main.37/
Cite (ACL):
Alex Terentowicz, Mateusz Lango, and Ondrej Dusek. 2025. How (un)faithful are explainable LLM-based NLG metrics?. In Proceedings of the 18th International Natural Language Generation Conference, pages 617–658, Hanoi, Vietnam. Association for Computational Linguistics.
Cite (Informal):
How (un)faithful are explainable LLM-based NLG metrics? (Terentowicz et al., INLG 2025)
PDF:
https://preview.aclanthology.org/author-page-you-zhang-rochester/2025.inlg-main.37.pdf