TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference
Xin Zhou, Yi Lu, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang
Abstract
Pre-trained language models (PLMs) are often deployed as cloud services, enabling users to upload textual data and perform inference remotely. However, users’ personal text often contains sensitive information, and sharing such data directly with the service providers can lead to serious privacy leakage. To address this problem, we introduce a novel privacy-preserving inference framework called MixPi, which prevents plaintext leakage during the inference phase. Inspired by k-anonymity, MixPi aims to obfuscate a user’s private input by mixing it with multiple other inputs, thereby confounding potential privacy attackers. To achieve this, our approach involves: (1) proposing a novel encryption module, Privacy Mixer, which encrypts input from three distinct dimensions: mixing, representation, and position; (2) adopting a pre-trained Multi-input Multi-output network to handle mixed representations and obtain multiple predictions; (3) employing a Privacy Demixer to ensure only the user can decrypt the real output among the multiple predictions. Furthermore, we explore different ways to automatically generate the synthetic inputs required for mixing. Experimental results on token and sentence classification tasks demonstrate that MixPi greatly surpasses existing privacy-preserving methods in both performance and privacy.
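To make the mix-then-demix pipeline in the abstract concrete, here is a minimal illustrative sketch. It is not the paper's implementation: the function names (`mix_inputs`, `demix_output`), the convex random-coefficient mixing, the toy stand-in encoder, and all shapes are assumptions for illustration only; the actual Privacy Mixer also encrypts the representation and position dimensions, and the server-side model is a pre-trained Multi-input Multi-output PLM.

```python
# Hypothetical sketch of the mix -> shared encoder -> demix idea from the
# abstract. Names, shapes, and the coefficient scheme are illustrative
# assumptions, NOT the paper's actual method.
import numpy as np

rng = np.random.default_rng(0)

def mix_inputs(real_emb, synthetic_embs, rng):
    """Combine the real input embedding with synthetic ones using secret
    random coefficients; the user keeps the coefficients and the slot index."""
    all_embs = np.stack([real_emb] + synthetic_embs)   # (k, seq, dim)
    coeffs = rng.random(len(all_embs))
    coeffs /= coeffs.sum()                             # convex combination
    mixed = np.tensordot(coeffs, all_embs, axes=1)     # (seq, dim)
    return mixed, coeffs

def encoder(mixed, k=3):
    """Toy stand-in for the server-side Multi-input Multi-output network:
    it returns one prediction slot per mixed-in input."""
    return [mixed.mean() + i for i in range(k)]

def demix_output(predictions, slot):
    """Client-side Privacy Demixer: only the user knows which output slot
    corresponds to the real input."""
    return predictions[slot]

# Toy usage: one real input mixed with two synthetic inputs.
seq_len, dim = 8, 16
real = rng.normal(size=(seq_len, dim))
synthetic = [rng.normal(size=(seq_len, dim)) for _ in range(2)]

mixed, coeffs = mix_inputs(real, synthetic, rng)   # only `mixed` leaves the client
preds = encoder(mixed)                             # server returns k predictions
print("real prediction:", demix_output(preds, slot=0))
```

The privacy intuition, per the abstract's appeal to k-anonymity, is that the server only ever sees the mixed representation, so an attacker cannot tell which of the k inputs is the user's real one.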
- Anthology ID: 2023.findings-emnlp.244
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Houda Bouamor, Juan Pino, Kalika Bali
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 3749–3762
- URL: https://aclanthology.org/2023.findings-emnlp.244
- DOI: 10.18653/v1/2023.findings-emnlp.244
- Cite (ACL): Xin Zhou, Yi Lu, Ruotian Ma, Tao Gui, Qi Zhang, and Xuanjing Huang. 2023. TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 3749–3762, Singapore. Association for Computational Linguistics.
- Cite (Informal): TextMixer: Mixing Multiple Inputs for Privacy-Preserving Inference (Zhou et al., Findings 2023)
- PDF: https://preview.aclanthology.org/ingest-2024-clasp/2023.findings-emnlp.244.pdf