Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability

Chiwei Zhu; Benfeng Xu; An Yang; Junyang Lin; Quan Wang; Chang Zhou; Zhendong Mao

Abstract

Training language models with rationale augmentation has been shown to be beneficial in many existing works. In this paper, we find that this prevailing view does not hold consistently. We conduct comprehensive investigations into the impact of rationales on model performance and, from a novel perspective, on model reliability. The results lead to several key findings that add new insights to existing understanding: 1) Rationales can, at times, deteriorate model performance; 2) Rationales can, at times, improve model reliability, even outperforming their untrained counterparts; 3) A linear correspondence exists between the performance and reliability improvements, while both are driven by the intrinsic difficulty of the task. These findings provide informative guidance on the broad utilization of rationales and raise critical implications for the procedure of explicitly aligning language models with implicit human thoughts. Code is available at this anonymous link: https://anonymous.4open.science/r/rationales-CEE8.

Anthology ID: 2025.findings-acl.302
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 5808–5835
URL: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.302/
Cite (ACL): Chiwei Zhu, Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, and Zhendong Mao. 2025. Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability. In Findings of the Association for Computational Linguistics: ACL 2025, pages 5808–5835, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability (Zhu et al., Findings 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.302.pdf
