Discussion on domain generalization in the cross-device speaker verification system

Wei-Ting Lin, Yu-Jia Zhang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan


Abstract
In this paper, we use domain generalization to improve the performance of a cross-device speaker verification system. Starting from a trainable speaker verification system, we fine-tune the model parameters with domain generalization algorithms. First, we train ECAPA-TDNN on the VoxCeleb2 dataset as a baseline model. We then fine-tune it on the CHT-TDSV dataset with the following domain generalization algorithms: DANN, CDNN, and Deep CORAL. We test the proposed system in 10 different scenarios from the NSYSU-TDSV dataset, covering both single-device and multiple-device conditions. In the multiple-device scenario, the best equal error rate decreases from 18.39 for the baseline to 8.84, successfully achieving cross-device identification in the speaker verification system.
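The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal illustrative sketch of how that metric can be computed from trial scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the threshold sweep, and the toy scores are all assumptions made for illustration. It returns a fraction, whereas the figures in the abstract appear to be percentages.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) from verification trial scores.

    target_scores: scores of same-speaker trials (higher = more similar).
    nontarget_scores: scores of different-speaker trials.
    Hypothetical helper for illustration only.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up): one same-speaker trial scores below one impostor trial.
same_speaker = [0.9, 0.8, 0.7, 0.4]
different_speaker = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(same_speaker, different_speaker))
```

With only a handful of trials the sweep can land between two thresholds, so the sketch averages the two error rates at the closest point; evaluation toolkits usually interpolate along the ROC curve instead.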
Anthology ID:
2021.rocling-1.12
Volume:
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)
Month:
October
Year:
2021
Address:
Taoyuan, Taiwan
Venue:
ROCLING
Publisher:
The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages:
87–94
URL:
https://aclanthology.org/2021.rocling-1.12
Cite (ACL):
Wei-Ting Lin, Yu-Jia Zhang, Chia-Ping Chen, Chung-Li Lu, and Bo-Cheng Chan. 2021. Discussion on domain generalization in the cross-device speaker verification system. In Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021), pages 87–94, Taoyuan, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
Cite (Informal):
Discussion on domain generalization in the cross-device speaker verification system (Lin et al., ROCLING 2021)
PDF:
https://preview.aclanthology.org/update-css-js/2021.rocling-1.12.pdf