Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents

Ivoline C. Ngong, Swanand Ravindra Kadhe, Hao Wang, Keerthiram Murugesan, Justin D. Weisz, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy

Abstract
Conversational agents are increasingly woven into individuals' personal lives, yet users often underestimate the privacy risks involved. The moment users share information with these agents, such as large language models (LLMs), their private information becomes vulnerable to exposure. In this paper, we characterize the notion of contextual privacy for user interactions with LLM-based Conversational Agents (LCAs). Contextual privacy aims to minimize privacy risks by ensuring that users (senders) disclose only information that is both relevant and necessary for achieving their intended goals when interacting with LCAs (untrusted receivers). Through a formative design user study, we observe how even "privacy-conscious" users inadvertently reveal sensitive information through indirect disclosures. Based on insights from this study, we propose a locally deployable framework that operates between users and LCAs, identifying and reformulating out-of-context information in user prompts. Our evaluation on examples from ShareGPT shows that lightweight models can effectively implement this framework, achieving strong gains in contextual privacy while preserving the user's intended interaction goals. Notably, about 76% of participants in our human evaluation preferred the reformulated prompts over the original ones, validating the usability and effectiveness of contextual privacy in our proposed framework. We open-source the code at https://github.com/IBM/contextual-privacy-LLM.
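
The abstract describes a detect-and-reformulate pipeline that sits locally between the user and the untrusted LCA. The following Python sketch illustrates that overall flow under stated assumptions: simple regexes stand in for the paper's lightweight local models, and every name here (`detect_out_of_context`, `reformulate`, the pattern categories) is hypothetical, not the released implementation from the repository above.

```python
import re

# Illustrative sketch of the detect-and-reformulate flow described in the
# abstract. Assumption: the actual framework uses lightweight local LLMs to
# judge which details are out of context for the user's goal; regexes stand
# in for that detector here so the sketch stays self-contained and runnable.

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.\w+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def detect_out_of_context(prompt: str) -> list[tuple[str, str]]:
    """Return (category, span) pairs judged unnecessary for the task goal."""
    hits = []
    for label, pattern in PATTERNS.items():
        hits.extend((label, m.group()) for m in pattern.finditer(prompt))
    return hits


def reformulate(prompt: str) -> str:
    """Rewrite the prompt with out-of-context details abstracted away while
    preserving the task itself. Here a placeholder substitution; the paper's
    framework instead rewrites the prompt with a local model."""
    for label, span in detect_out_of_context(prompt):
        prompt = prompt.replace(span, f"[{label}]")
    return prompt


if __name__ == "__main__":
    original = (
        "Rewrite this note to my landlord: I'm John, you can reach me at "
        "555-123-4567 or john.doe@example.com. The kitchen sink leaks."
    )
    # Only the reformulated prompt would be forwarded to the untrusted LCA.
    print(reformulate(original))
```

Note that a rule-based stand-in misses exactly the indirect disclosures the paper highlights (e.g., the name "John" above survives), which is why the framework relies on model-based detection; the sketch only captures the local-middleware shape of the pipeline, not its coverage.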
Anthology ID: 2025.findings-acl.1343
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 26196–26220
URL: https://preview.aclanthology.org/transition-to-people-yaml/2025.findings-acl.1343/
DOI: 10.18653/v1/2025.findings-acl.1343
Cite (ACL): Ivoline C. Ngong, Swanand Ravindra Kadhe, Hao Wang, Keerthiram Murugesan, Justin D. Weisz, Amit Dhurandhar, and Karthikeyan Natesan Ramamurthy. 2025. Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents. In Findings of the Association for Computational Linguistics: ACL 2025, pages 26196–26220, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents (Ngong et al., Findings 2025)
PDF: https://preview.aclanthology.org/transition-to-people-yaml/2025.findings-acl.1343.pdf