Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection

Sheng Li, Jiyi Li, Qianying Liu, Zhuo Gong


Abstract
With the advent of the General Data Protection Regulation (GDPR) and growing privacy concerns, sharing speech data faces significant challenges. Protecting the sensitive content of speech is as important as protecting the voiceprint. This paper proposes an effective speech content protection method based on a frame-by-frame adversarial speech generation system. We revisit adversarial example generation methods from recent machine learning research and select the phonetic state sequence of sensitive speech for adversarial example generation. We then build an adversarial speech collection. Based on this collection, we propose a neural network-based frame-by-frame mapping method that recovers the speech content by converting adversarial speech back to human speech. Experiments show that the proposed method can encode and recover any sensitive audio, and that it is easy to implement with publicly available speech recognition resources.
Anthology ID:
2022.lrec-1.792
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
7291–7297
URL:
https://aclanthology.org/2022.lrec-1.792
Cite (ACL):
Sheng Li, Jiyi Li, Qianying Liu, and Zhuo Gong. 2022. Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 7291–7297, Marseille, France. European Language Resources Association.
Cite (Informal):
Adversarial Speech Generation and Natural Speech Recovery for Speech Content Protection (Li et al., LREC 2022)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2022.lrec-1.792.pdf
Data
LibriSpeech