2025
CENTAUR: Bridging the Impossible Trinity of Privacy, Efficiency, and Performance in Privacy-Preserving Transformer Inference
Jinglong Luo | Guanzhong Chen | Yehong Zhang | Shiyu Liu | Hui Wang | Yue Yu | Xun Zhou | Yuan Qi | Zenglin Xu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
With the growing deployment of pre-trained models like Transformers on cloud platforms, privacy concerns about model parameters and inference data are intensifying. Existing Privacy-Preserving Transformer Inference (PPTI) frameworks face an “impossible trinity” of privacy, efficiency, and performance: Secure Multi-Party Computation (SMPC)-based approaches ensure strong privacy but suffer from high computational overhead and performance losses; conversely, permutation-based methods achieve near-plaintext efficiency and accuracy but compromise privacy by exposing sensitive model parameters and intermediate results. Bridging this gap with a single approach is challenging, which motivates CENTAUR, a PPTI framework that integrates random permutations with SMPC to address the “impossible trinity”. By designing efficient PPTI algorithms tailored to the structural properties of Transformer models, CENTAUR balances privacy, efficiency, and performance. Our experiments demonstrate that CENTAUR resists diverse data reconstruction attacks, achieves plaintext-level inference accuracy, and boosts inference speed by 5.0 to 30.4 times, unlocking new possibilities for secure and efficient AI deployment.
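The abstract’s claim that permutation-based computation preserves plaintext accuracy can be illustrated with a minimal NumPy sketch. This is not the CENTAUR protocol itself (which combines permutations with SMPC); the client/server split, shapes, and variable names below are illustrative assumptions. The point is simply that a random permutation of hidden dimensions commutes with a linear layer, so the matrix multiplication can run on permuted data and be inverted afterwards with no accuracy loss.

```python
# Toy sketch (not the CENTAUR protocol): outsourcing a linear layer on permuted data.
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, batch = 8, 4, 2
x = rng.normal(size=(batch, d_in))   # private activation (client side)
W = rng.normal(size=(d_in, d_out))   # private weight

# Client-side secrets: random permutations of the input and output dimensions.
pi_in = rng.permutation(d_in)
pi_out = rng.permutation(d_out)

# What the server would see: permuted activation and consistently permuted weight.
x_perm = x[:, pi_in]
W_perm = W[pi_in][:, pi_out]

# Server computes the matmul on permuted data at near-plaintext cost.
y_perm = x_perm @ W_perm

# Client inverts the output permutation to recover the true result.
y = y_perm[:, np.argsort(pi_out)]

assert np.allclose(y, x @ W)  # exact: no accuracy loss from the permutation
```

In the paper’s setting, such permutation tricks handle the heavy linear algebra, while SMPC protects the parts a permutation alone cannot hide; the sketch only demonstrates the exactness of the permuted computation.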
COPR: Continual Human Preference Learning via Optimal Policy Regularization
Han Zhang | Lin Gui | Yu Lei | Yuanzhao Zhai | Yehong Zhang | Zhuo Zhang | Yulan He | Hui Wang | Yue Yu | Kam-Fai Wong | Bin Liang | Ruifeng Xu
Findings of the Association for Computational Linguistics: ACL 2025
Reinforcement Learning from Human Feedback (RLHF) is effective for aligning Large Language Models (LLMs) with human preferences. However, RLHF’s complex process limits its ability to continually learn from human feedback, making it impractical for real-world applications where a deployed model continuously receives feedback from users. Non-RL methods such as Direct Preference Optimization (DPO) are not inherently suited to Continual Learning (CL). We observe that when combined with Experience Replay (ER) for CL, DPO tends to significantly widen the probability gap between human-preferred and dispreferred responses. Consequently, this diminishes the diversity of model generations, potentially leading to model collapse. To overcome these challenges, we propose Continual Optimal Policy Regularization (COPR), a novel non-RL offline method that converts historical optimal policies into optimization constraints when continually learning new preferences. We first derive a moderate reward function from the pairwise ranking loss and then use this moderate reward to compute a new sampling distribution, from which we construct novel learning objectives and constraints. We also provide a formal proof of the learnability of COPR. Experimental results show that COPR outperforms strong CL baselines on our proposed benchmark in terms of reward-based evaluation, GPT-4 evaluation, and human assessment.
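The pipeline the abstract describes (pairwise ranking loss → reward signal → sampling distribution) can be sketched in a simplified form. This is not the authors’ COPR implementation; the Bradley-Terry-style loss, the softmax sampling distribution, the function names, and the temperature `beta` are illustrative assumptions intended only to make the described steps concrete.

```python
# Hedged sketch (not COPR itself): ranking loss, reward, and a reward-based sampling distribution.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pairwise_ranking_loss(r_chosen, r_rejected):
    """Bradley-Terry style negative log-likelihood that the chosen response outranks the rejected one."""
    return -np.log(sigmoid(r_chosen - r_rejected))

def sampling_distribution(rewards, beta=1.0):
    """Softmax over rewards: higher-reward responses get more probability mass."""
    z = np.asarray(rewards, dtype=float) / beta
    z -= z.max()                       # numerical stability
    p = np.exp(z)
    return p / p.sum()

# Toy rewards for three candidate responses to one prompt.
rewards = [2.0, 0.5, -1.0]
print(sampling_distribution(rewards, beta=1.0))   # distribution usable as a learning target/constraint
print(pairwise_ranking_loss(2.0, 0.5))            # loss for the (chosen, rejected) pair
```

In COPR, distributions of this kind derived from historical optimal policies serve as regularization constraints while new preferences are learned; the sketch only shows the shape of the intermediate quantities, not the continual-learning objective itself.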
2024
SecFormer: Fast and Accurate Privacy-Preserving Inference for Transformer Models via SMPC
Jinglong Luo | Yehong Zhang | Zhuo Zhang | Jiaqi Zhang | Xin Mu | Hui Wang | Yue Yu | Zenglin Xu
Findings of the Association for Computational Linguistics: ACL 2024