Towards Speaker Verification for Crowdsourced Speech Collections

John Mendonca, Rui Correia, Mariana Lourenço, João Freitas, Isabel Trancoso


Abstract
Crowdsourcing the collection of speech provides a scalable way to reach a demographic tailored to each dataset's needs. The correctness of speaker metadata is especially relevant for speaker-centred collections, i.e., those that require a fixed amount of data per speaker. This paper identifies two types of misalignment present in these collections: Multiple Accounts misalignment (different contributors map to the same speaker) and Multiple Speakers misalignment (multiple speakers map to the same contributor). Based on state-of-the-art approaches to Speaker Verification, this paper proposes an unsupervised method for measuring the plausibility of a collection's speaker metadata, i.e., evaluating the match (or lack thereof) between contributors and speakers. The proposed solution is composed of an embedding extractor and a clustering module. Results indicate high precision (>0.94) in automatically classifying contributor alignment.
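The two misalignment types can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the embedding extractor or the clustering algorithm, so the example assumes toy 2-dimensional vectors in place of real speaker embeddings and a simple threshold-based connected-components clustering. The function names (`cosine`, `cluster`, `find_misalignments`) and the 0.8 threshold are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def cluster(embeddings, threshold=0.8):
    """Assumed clustering step: link recordings whose similarity is at least
    `threshold`, then return one cluster label per recording (union-find)."""
    parent = list(range(len(embeddings)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(embeddings)), 2):
        if cosine(embeddings[i], embeddings[j]) >= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(embeddings))]


def find_misalignments(contributors, embeddings, threshold=0.8):
    """Compare contributor IDs against voice clusters.

    Returns (multiple_accounts, multiple_speakers):
    - multiple_accounts: groups of contributors sharing one voice cluster
    - multiple_speakers: contributors whose recordings span several clusters
    """
    labels = cluster(embeddings, threshold)
    by_cluster = defaultdict(set)
    by_contributor = defaultdict(set)
    for contributor, label in zip(contributors, labels):
        by_cluster[label].add(contributor)
        by_contributor[contributor].add(label)
    multiple_accounts = sorted(sorted(c) for c in by_cluster.values() if len(c) > 1)
    multiple_speakers = sorted(c for c, ls in by_contributor.items() if len(ls) > 1)
    return multiple_accounts, multiple_speakers


# Toy data (hypothetical): one contributor ID and one embedding per recording.
contributors = ["a", "a", "b", "c", "c"]
embeddings = [
    [1.0, 0.0],    # contributor a, first voice
    [0.9, 0.1],    # contributor a, first voice
    [1.0, 0.05],   # contributor b, same voice as a
    [0.0, 1.0],    # contributor c, second voice
    [-1.0, 0.0],   # contributor c, third voice
]
accounts, speakers = find_misalignments(contributors, embeddings)
print(accounts)  # [['a', 'b']]
print(speakers)  # ['c']
```

In this toy setup, contributors `a` and `b` fall into one voice cluster (a Multiple Accounts case), while contributor `c` spans two clusters (a Multiple Speakers case). A real pipeline would replace the toy vectors with embeddings from a trained speaker-verification model and tune the threshold on held-out data.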
Anthology ID:
2022.lrec-1.637
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5929–5937
URL:
https://aclanthology.org/2022.lrec-1.637
Cite (ACL):
John Mendonca, Rui Correia, Mariana Lourenço, João Freitas, and Isabel Trancoso. 2022. Towards Speaker Verification for Crowdsourced Speech Collections. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5929–5937, Marseille, France. European Language Resources Association.
Cite (Informal):
Towards Speaker Verification for Crowdsourced Speech Collections (Mendonca et al., LREC 2022)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2022.lrec-1.637.pdf
Data:
MUSAN