REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance
Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky, Maarten Sap
Abstract
The ability to communicate uncertainty and knowledge limitations is crucial for the safety of large language models (LLMs). Current evaluations of these abilities typically examine the correspondence between model accuracy and its internal probabilities or linguistic outputs. However, evaluation of the uncertainty of LLM communication should also focus on the behaviors of their human interlocutors: how much do users rely on what the LLM says? We introduce an interaction-centered evaluation approach called Rel-A.I. (pronounced “rely”) that quantifies whether and how humans rely on LLMs’ responses, complementing existing calibration evaluations. Through nine user studies with 450 participants, we investigate three crucial aspects that influence user reliance. We show that emphatic expressions of politeness (e.g., “I’m happy to help!”) that precede LLM answers will cause participants to perceive these models as more competent, and in turn, rely 30% more on their generations. Additionally, the context of the interaction, such as the knowledge domain and nature of previous interactions with the LLM, substantially influences user reliance (e.g., users will rely 10% more on LLMs when responding to questions involving calculations). Our results show that calibration and language quality alone are insufficient in informing which LLMs are safely calibrated, and illustrate the need to consider features of the interactional context.- Anthology ID:
- 2025.naacl-long.556
- Volume:
- Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
- Month:
- April
- Year:
- 2025
- Address:
- Albuquerque, New Mexico
- Editors:
- Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue:
- NAACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 11148–11167
- Language:
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.556/
- DOI:
- Cite (ACL):
- Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Nouha Dziri, Dan Jurafsky, and Maarten Sap. 2025. REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 11148–11167, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- REL-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance (Zhou et al., NAACL 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.556.pdf