Zhihao Yao
2026
Adaptive Backtracking for Privacy Protection in Large Language Models
Zhihao Yao | Yuxuan Gu | Xiachong Feng | Weitao Ma | Bo Li | Xiaocheng Feng | Bing Qin
Findings of the Association for Computational Linguistics: ACL 2026
Privacy leakage has become a critical problem in large language models (LLMs), especially in retrieval-augmented generation. Current defense methods mitigate privacy leakage but still suffer from a trade-off between privacy protection and response availability. To address this problem, we propose to explicitly capture the latent leakage tendency of the LLM during the generation process, which protects privacy from a more fundamental perspective. In detail, we propose ABack, a training-free mechanism that synchronously monitors the decoding steps, derives the initial leakage intention by modeling mental states, and rewrites the response with privacy awareness. In addition, given the lack of formal privacy datasets, we construct a new benchmark specifically for personally identifiable information. Experiments show that ABack improves privacy by up to 14% over strong baselines against adversarial attacks while avoiding degradation of response utility.