以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統(Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition)
Chih-Ting Yehn, Po-Chin Wang, Su-Yu Zhang, Chia-Ping Chen, Shan-Wen Hsiao, Bo-Cheng Chan, Chung-li Lu
- Anthology ID: 2019.rocling-1.29
- Volume: Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019)
- Month: October
- Year: 2019
- Address: New Taipei City, Taiwan
- Venue: ROCLING
- Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
- Pages: 310–324
- Language: Chinese
- URL: https://aclanthology.org/2019.rocling-1.29
- Cite (ACL): Chih-Ting Yehn, Po-Chin Wang, Su-Yu Zhang, Chia-Ping Chen, Shan-Wen Hsiao, Bo-Cheng Chan, and Chung-li Lu. 2019. 以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統(Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition). In Proceedings of the 31st Conference on Computational Linguistics and Speech Processing (ROCLING 2019), pages 310–324, New Taipei City, Taiwan. The Association for Computational Linguistics and Chinese Language Processing (ACLCLP).
- Cite (Informal): 以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統(Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition) (Yehn et al., ROCLING 2019)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2019.rocling-1.29.pdf
- Data: MUSAN, VoxCeleb1, VoxCeleb2