Adaptive Backtracking for Privacy Protection in Large Language Models
Zhihao Yao, Yuxuan Gu, Xiachong Feng, Weitao Ma, Bo Li, Xiaocheng Feng, Bing Qin
Abstract
The privacy leakage problem has become a critical topic in large language models, especially in the scenario of retrieval-augmented generation. Current defense methods mitigate privacy leakage but still suffer from a trade-off between privacy protection and response availability. To address this problem, we propose to explicitly capture the latent leakage tendency of the LLM during the generation process, which protects privacy from a more fundamental perspective. In detail, we propose ABack, a training-free mechanism that synchronously monitors the decoding steps, derives the initial leakage intention by modeling mental states, and rewrites the response with privacy awareness. In addition, we construct a new benchmark specifically for personally identifiable information, considering the lack of formal privacy datasets. Experiments show that ABack improves privacy by up to 14% over strong baselines against adversarial attacks while avoiding degradation of response utility.
- Anthology ID:
- 2026.findings-acl.1857
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 37278–37298
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1857/
- Cite (ACL):
- Zhihao Yao, Yuxuan Gu, Xiachong Feng, Weitao Ma, Bo Li, Xiaocheng Feng, and Bing Qin. 2026. Adaptive Backtracking for Privacy Protection in Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 37278–37298, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Adaptive Backtracking for Privacy Protection in Large Language Models (Yao et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1857.pdf