DP-BART for Privatized Text Rewriting under Local Differential Privacy

Timour Igamberdiev, Ivan Habernal


Abstract
Privatized text rewriting with local differential privacy (LDP) is a recent approach that enables sharing of sensitive textual documents while formally guaranteeing privacy protection to individuals. However, existing systems face several issues, such as formal mathematical flaws, unrealistic privacy guarantees, privatization of only individual words, as well as a lack of transparency and reproducibility. In this paper, we propose a new system ‘DP-BART’ that largely outperforms existing LDP systems. Our approach uses a novel clipping method, iterative pruning, and further training of internal representations which drastically reduces the amount of noise required for DP guarantees. We run experiments on five textual datasets of varying sizes, rewriting them at different privacy guarantees and evaluating the rewritten texts on downstream text classification tasks. Finally, we thoroughly discuss the privatized text rewriting approach and its limitations, including the problem of the strict text adjacency constraint in the LDP paradigm that leads to the high noise requirement.
Anthology ID:
2023.findings-acl.874
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13914–13934
URL:
https://aclanthology.org/2023.findings-acl.874
DOI:
10.18653/v1/2023.findings-acl.874
Cite (ACL):
Timour Igamberdiev and Ivan Habernal. 2023. DP-BART for Privatized Text Rewriting under Local Differential Privacy. In Findings of the Association for Computational Linguistics: ACL 2023, pages 13914–13934, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
DP-BART for Privatized Text Rewriting under Local Differential Privacy (Igamberdiev & Habernal, Findings 2023)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2023.findings-acl.874.pdf