Abstract
It has been demonstrated that hidden representations learned by deep models can encode private information about the input and can therefore be exploited to recover that information with reasonable accuracy. To address this issue, we propose a novel approach called Differentially Private Neural Representation (DPNR) to preserve the privacy of representations extracted from text. DPNR utilises Differential Privacy (DP) to provide a formal privacy guarantee. We also show that masking words via dropout can further enhance privacy. To maintain the utility of the learned representation, we integrate the DP-noisy representation into a robust training process to derive a robust target model, which also improves model fairness across various demographic variables. Experimental results on benchmark datasets under various parameter settings demonstrate that DPNR largely reduces privacy leakage without significantly sacrificing main-task performance.
- Anthology ID:
- 2020.findings-emnlp.213
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2020
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Trevor Cohn, Yulan He, Yang Liu
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2355–2365
- URL:
- https://aclanthology.org/2020.findings-emnlp.213
- DOI:
- 10.18653/v1/2020.findings-emnlp.213
- Cite (ACL):
- Lingjuan Lyu, Xuanli He, and Yitong Li. 2020. Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 2355–2365, Online. Association for Computational Linguistics.
- Cite (Informal):
- Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness (Lyu et al., Findings 2020)
- PDF:
- https://preview.aclanthology.org/gem-23-ingestion/2020.findings-emnlp.213.pdf
- Code:
- xlhex/dpnlp
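
The abstract describes two privacy mechanisms: masking words via dropout and adding DP noise to the extracted representation. The Python sketch below illustrates that general idea only. The function names (`word_dropout`, `privatize`), the `<mask>` token, the clipping range [0, 1], and the Laplace noise calibration are all assumptions made for illustration; they are not taken from the paper or from the xlhex/dpnlp code.

```python
import random


def word_dropout(tokens, drop_prob, rng, mask_token="<mask>"):
    """Replace each token with mask_token independently with probability drop_prob.

    mask_token is a hypothetical placeholder chosen for this sketch.
    """
    return [mask_token if rng.random() < drop_prob else tok for tok in tokens]


def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize(representation, epsilon, rng):
    """Clip each coordinate to [0, 1], then add Laplace noise to every coordinate.

    Assumption: a generic Laplace mechanism in which the L1 sensitivity of a
    d-dimensional vector in [0, 1]^d is d, so the noise scale is d / epsilon.
    The paper's own calibration may differ.
    """
    clipped = [min(1.0, max(0.0, v)) for v in representation]
    sensitivity = float(len(clipped))
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in clipped]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
masked = word_dropout("the service was great".split(), drop_prob=0.5, rng=rng)
noisy = privatize([0.2, 1.7, -0.3, 0.9], epsilon=1.0, rng=rng)
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of utility, which is the trade-off the abstract refers to when it reports results under various parameter settings.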