On a Utilitarian Approach to Privacy Preserving Text Generation

Zekun Xu, Abhinav Aggarwal, Oluwaseyi Feyisetan, Nathanael Teissier


Abstract
Differentially private mechanisms for text generation typically add carefully calibrated noise to input words and use the nearest neighbor to the noised input as the output word. When the noise is small in magnitude, these mechanisms are susceptible to reconstruction of the original sensitive text, because the nearest neighbor to the noised input is likely to be the original input. To mitigate this empirical privacy risk, we propose a novel class of differentially private mechanisms that parameterizes the nearest neighbor selection criterion used in traditional mechanisms. Motivated by the Vickrey auction, where only the second highest price is revealed and the highest price is kept private, we balance the choice between the first and the second nearest neighbors in the proposed class of mechanisms using a tuning parameter. This parameter is selected by empirically solving a constrained optimization problem that maximizes utility while maintaining the desired privacy guarantees. We argue that this empirical measurement framework can be used to align different mechanisms along a common benchmark for their privacy-utility tradeoff, particularly when different distance metrics are used to calibrate the amount of noise added. Our experiments on real text classification datasets show up to 50% improvement in utility compared to the existing state-of-the-art with the same empirical privacy guarantee.
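
The selection step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy two-dimensional `VOCAB`, the function names, the Euclidean metric, the noise sampler (a uniform direction scaled by a Gamma-distributed magnitude, giving a density proportional to exp(-epsilon * ||z||)), and in particular the rule "pick the first nearest neighbor with probability `t`, otherwise the second". The paper's actual selection rule and its privacy analysis may differ, and this sketch makes no privacy claim of its own.

```python
import math
import random

# Hypothetical toy embeddings; real mechanisms use pretrained word vectors.
VOCAB = {
    "good": [1.0, 0.0],
    "great": [0.9, 0.1],
    "bad": [-1.0, 0.0],
}


def sample_noise(dim, epsilon, rng):
    """Draw noise with density proportional to exp(-epsilon * ||z||)."""
    # Uniform direction on the sphere: normalize a standard Gaussian vector.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    # Magnitude follows Gamma(shape=dim, scale=1/epsilon).
    magnitude = rng.gammavariate(dim, 1.0 / epsilon)
    return [magnitude * x / norm for x in direction]


def two_nearest(point, vocab):
    """Return the two vocabulary words closest to `point` (Euclidean)."""
    ranked = sorted(vocab, key=lambda w: math.dist(point, vocab[w]))
    return ranked[0], ranked[1]


def privatize_word(word, vocab, epsilon, t, rng):
    """Noise the word's embedding, then choose between its two nearest words.

    Assumed rule (not the paper's): first neighbor with probability t,
    second neighbor otherwise. t = 1 recovers plain nearest-neighbor output.
    """
    vector = vocab[word]
    noise = sample_noise(len(vector), epsilon, rng)
    noisy = [v + n for v, n in zip(vector, noise)]
    first, second = two_nearest(noisy, vocab)
    return first if rng.random() < t else second


if __name__ == "__main__":
    rng = random.Random(0)
    for t in (1.0, 0.5, 0.0):
        outputs = [privatize_word("good", VOCAB, 5.0, t, rng) for _ in range(8)]
        print(t, outputs)
```

With a very large `epsilon` the noise is negligible, so `t = 1.0` returns the input word itself (the reconstruction risk the abstract describes), while `t = 0.0` always returns the runner-up neighbor instead.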
Anthology ID: 2021.privatenlp-1.2
Volume: Proceedings of the Third Workshop on Privacy in Natural Language Processing
Month: June
Year: 2021
Address: Online
Editors: Oluwaseyi Feyisetan, Sepideh Ghanavati, Shervin Malmasi, Patricia Thaine
Venue: PrivateNLP
Publisher: Association for Computational Linguistics
Pages: 11–20
URL: https://aclanthology.org/2021.privatenlp-1.2
DOI: 10.18653/v1/2021.privatenlp-1.2
Cite (ACL): Zekun Xu, Abhinav Aggarwal, Oluwaseyi Feyisetan, and Nathanael Teissier. 2021. On a Utilitarian Approach to Privacy Preserving Text Generation. In Proceedings of the Third Workshop on Privacy in Natural Language Processing, pages 11–20, Online. Association for Computational Linguistics.
Cite (Informal): On a Utilitarian Approach to Privacy Preserving Text Generation (Xu et al., PrivateNLP 2021)
PDF: https://preview.aclanthology.org/naacl-24-ws-corrections/2021.privatenlp-1.2.pdf
Data: IMDb Movie Reviews