Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs

Qiongkai Xu; Xuanli He; Lingjuan Lyu; Lizhen Qu; Gholamreza Haffari

Abstract

Machine-learning-as-a-service (MLaaS) has attracted millions of users to its splendid large-scale models. Although published as black-box APIs, the valuable models behind these services are still vulnerable to imitation attacks. Recently, a series of works have demonstrated that attackers can steal or extract the victim models. Nonetheless, none of the previously stolen models outperforms the original black-box APIs. In this work, we use unsupervised domain adaptation and multi-victim ensembling to show that attackers could potentially surpass victims, which goes beyond the previous understanding of model extraction. Extensive experiments on both benchmark datasets and real-world APIs validate that the imitators can outperform the original black-box models on transferred domains. We consider our work a milestone in the research on imitation attacks, especially on NLP APIs, as the superior performance could influence the defense or even the publishing strategy of API providers.

Anthology ID: 2022.coling-1.251
Volume: Proceedings of the 29th International Conference on Computational Linguistics
Month: October
Year: 2022
Address: Gyeongju, Republic of Korea
Venue: COLING
Publisher: International Committee on Computational Linguistics
Pages: 2849–2860
URL: https://aclanthology.org/2022.coling-1.251
Cite (ACL): Qiongkai Xu, Xuanli He, Lingjuan Lyu, Lizhen Qu, and Gholamreza Haffari. 2022. Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2849–2860, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal): Student Surpasses Teacher: Imitation Attack for Black-Box NLP APIs (Xu et al., COLING 2022)
PDF: https://preview.aclanthology.org/ingestion-script-update/2022.coling-1.251.pdf
Data: IMDb Movie Reviews, SST, WMT 2014
