STAMP-R: Stylometric Text Anonymization with Memory-guided Policy Rewriting

Zhan Shi, Yefeng Yuan, Liang Cheng, Yuhong Liu


Abstract
Modern machine learning systems rely heavily on large-scale textual data that often contain sensitive personal information. Although conventional anonymization techniques remove explicit identifiers, textual data remain vulnerable to authorship inference attacks that exploit persistent stylometric signals. Recent approaches leverage Large Language Models (LLMs) to rewrite text and obscure such signals, but they frequently overlook distinctive stylometric outliers and fail to achieve a favorable privacy–utility trade-off due to rigid, one-size-fits-all obfuscation strategies, while also incurring high computational costs. To address these challenges, we propose STAMP-R, a risk-adaptive reinforcement learning framework for instance-level authorship anonymization. We formulate anonymization as a risk-aware, instance-level style distribution shaping problem. Central to our approach is the Style Manifold Memory (SMM), which models the global stylistic landscape via prototype-based density estimation. SMM detects high-risk stylometric outliers and adaptively modulates a composite reward function, enabling stronger obfuscation for highly identifiable samples while preserving semantic fidelity for low-risk instances. We further distill a lightweight 3B-parameter model from a teacher LLM for efficient local deployment. Experiments show that STAMP-R reduces authorship re-identification risk while maintaining strong downstream utility.
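The risk-adaptive idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how SMM estimates density or how the reward terms are combined. The function names `risk_score` and `composite_reward`, the Gaussian kernel, the `bandwidth` parameter, and the linear weighting are all assumptions made for illustration.

```python
import numpy as np


def risk_score(style_vec, prototypes, bandwidth=1.0):
    """Hypothetical risk score: one minus a kernel density over prototypes.

    Assumption: style is a fixed-length vector and the memory is a small
    array of prototype vectors. A sample far from every prototype sits in a
    low-density region and is treated as a high-risk stylometric outlier.
    """
    sq_dist = np.sum((prototypes - style_vec) ** 2, axis=1)
    density = np.mean(np.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    return 1.0 - density  # density lies in (0, 1], so risk lies in [0, 1)


def composite_reward(privacy, utility, risk):
    """Hypothetical reward: risk shifts weight from utility toward privacy."""
    return risk * privacy + (1.0 - risk) * utility


# Toy 2-D style space with two prototypes (made-up values).
prototypes = np.array([[0.0, 0.0], [1.0, 0.0]])
typical = np.array([0.5, 0.0])   # close to both prototypes
outlier = np.array([5.0, 5.0])   # far from both prototypes

for name, vec in [("typical", typical), ("outlier", outlier)]:
    r = risk_score(vec, prototypes)
    print(name, round(float(r), 3), composite_reward(0.2, 0.8, r))
```

Under these assumptions, the outlier receives a risk near 1, so its reward is dominated by the privacy term, while the typical sample is rewarded mostly for preserving utility.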
Anthology ID:
2026.privatenlp-main.4
Volume:
Proceedings of the Seventh Workshop on Privacy in Natural Language Processing
Month:
July
Year:
2026
Address:
San Diego, California
Editors:
Ivan Habernal, Sepideh Ghanavati, Sara Haghighi, Krithika Ramesh, Timour Igamberdiev, Shomir Wilson
Venues:
PrivateNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
53–68
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.privatenlp-main.4/
Cite (ACL):
Zhan Shi, Yefeng Yuan, Liang Cheng, and Yuhong Liu. 2026. STAMP-R: Stylometric Text Anonymization with Memory-guided Policy Rewriting. In Proceedings of the Seventh Workshop on Privacy in Natural Language Processing, pages 53–68, San Diego, California. Association for Computational Linguistics.
Cite (Informal):
STAMP-R: Stylometric Text Anonymization with Memory-guided Policy Rewriting (Shi et al., PrivateNLP 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.privatenlp-main.4.pdf