AudioPrivacy: Parallel Audio Dataset for Speaker Profiling with Diverse Audio Types and Rich Attributes

Jiabei He, Yanzhe Zhang, Jiaming Zhou, Hui Wang, Haoqin Sun, Yong Qin


Abstract
Speech signals convey abundant speaker-related metadata, yet current privacy research predominantly focuses on identity-centric voiceprint protection, leaving sensitive Speaker Attribute Privacy (SAP) largely underexplored. This paper introduces AudioPrivacy, a large-scale Chinese dataset designed to systematically evaluate SAP leakage in realistic, everyday scenarios. Comprising 227.3 hours of audio from 1,000 speakers, it uniquely encompasses four parallel modalities: speech, singing, paralinguistic expressions, and non-vocal acoustic signals (e.g., footsteps). Annotated with 11 diverse attributes, including fine-grained physiological traits often overlooked in traditional corpora, AudioPrivacy enables a granular analysis of acoustic privacy risks. Our evaluations reveal significant leakage across multiple attributes, even when inferred from non-vocal signals. Furthermore, we demonstrate that state-of-the-art Multimodal Large Language Models (MM LLMs) can precisely profile speakers and exacerbate these risks, underscoring the urgent need to rethink privacy-preserving mechanisms in the era of powerful audio foundation models.
Anthology ID:
2026.findings-acl.283
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
5735–5748
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.283/
Cite (ACL):
Jiabei He, Yanzhe Zhang, Jiaming Zhou, Hui Wang, Haoqin Sun, and Yong Qin. 2026. AudioPrivacy: Parallel Audio Dataset for Speaker Profiling with Diverse Audio Types and Rich Attributes. In Findings of the Association for Computational Linguistics: ACL 2026, pages 5735–5748, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
AudioPrivacy: Parallel Audio Dataset for Speaker Profiling with Diverse Audio Types and Rich Attributes (He et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.283.pdf
Checklist:
 2026.findings-acl.283.checklist.pdf