Towards Speaker Verification for Crowdsourced Speech Collections
John Mendonca, Rui Correia, Mariana Lourenço, João Freitas, Isabel Trancoso
Abstract
Crowdsourcing the collection of speech provides a scalable setting to access a customisable demographic according to each dataset’s needs. The correctness of speaker metadata is especially relevant for speaker-centred collections - ones that require the collection of a fixed amount of data per speaker. This paper identifies two different types of misalignment present in these collections: Multiple Accounts misalignment (different contributors map to the same speaker), and Multiple Speakers misalignment (multiple speakers map to the same contributor). Based on state-of-the-art approaches to Speaker Verification, this paper proposes an unsupervised method for measuring speaker metadata plausibility of a collection, i.e., evaluating the match (or lack thereof) between contributors and speakers. The solution presented is composed of an embedding extractor and a clustering module. Results indicate high precision in automatically classifying contributor alignment (>0.94).
- Anthology ID:
- 2022.lrec-1.637
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 5929–5937
- URL:
- https://aclanthology.org/2022.lrec-1.637
- Cite (ACL):
- John Mendonca, Rui Correia, Mariana Lourenço, João Freitas, and Isabel Trancoso. 2022. Towards Speaker Verification for Crowdsourced Speech Collections. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5929–5937, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Towards Speaker Verification for Crowdsourced Speech Collections (Mendonca et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2022.lrec-1.637.pdf
- Data
- MUSAN
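
The two misalignment types named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes speaker embeddings have already been extracted (one vector per utterance), and it swaps in a simple stand-in for the clustering module, namely connected components over a cosine-similarity threshold. The function names `cluster_embeddings` and `find_misalignments` and the threshold value `0.7` are hypothetical choices made for illustration only.

```python
import numpy as np


def cluster_embeddings(embeddings, threshold=0.7):
    """Assign a cluster label to each embedding.

    Stand-in clustering (an assumption, not the paper's method): two
    utterances are linked when their cosine similarity is >= threshold,
    and each connected component of linked utterances forms one cluster.
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalise rows
    sim = x @ x.T  # pairwise cosine similarities
    labels = [-1] * len(x)
    current = 0
    for start in range(len(x)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first walk over the similarity graph
            i = stack.pop()
            for j in np.flatnonzero(sim[i] >= threshold):
                j = int(j)
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def find_misalignments(contributor_ids, labels):
    """Compare contributor IDs against cluster labels.

    Returns a pair:
      - contributors whose utterances fall into more than one cluster
        (candidate Multiple Speakers misalignment);
      - groups of contributors sharing a cluster
        (candidate Multiple Accounts misalignment).
    """
    clusters_of = {}
    contributors_in = {}
    for contributor, label in zip(contributor_ids, labels):
        clusters_of.setdefault(contributor, set()).add(label)
        contributors_in.setdefault(label, set()).add(contributor)
    multiple_speakers = sorted(
        c for c, found in clusters_of.items() if len(found) > 1
    )
    multiple_accounts = sorted(
        sorted(group) for group in contributors_in.values() if len(group) > 1
    )
    return multiple_speakers, multiple_accounts
```

With toy two-dimensional vectors standing in for real embeddings, a contributor whose utterances land in two clusters would be flagged as a Multiple Speakers candidate, and contributors who share a cluster would be grouped as a Multiple Accounts candidate. In practice the threshold would need tuning on held-out data, and a flagged case is a candidate for review rather than a confirmed misalignment.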