Wei-Ting Lin


2022

pdf bib
探討語者驗證系統中特徵處理模組與注意力機制 (Investigation of Feature Processing Modules and Attention Mechanisms in Speaker Verification System) [In Chinese]
Ting-Wei Chen | Wei-Ting Lin | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 27, Number 2, December 2022

pdf
Investigation of feature processing modules and attention mechanisms in speaker verification system
Ting-Wei Chen | Wei-Ting Lin | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan | Yu-Han Cheng | Hsiang-Feng Chuang | Wei-Yu Chen
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)

In this paper, we use several combinations of feature front-end modules and attention mechanisms to improve the performance of our speaker verification system. An updated version of ECAPA-TDNN is chosen as a baseline. We replace and integrate different feature front-end and attention mechanism modules to compare and find the most effective model design, and this model would be our final system. We use VoxCeleb 2 dataset as our training set, and test the performance of our models on several test sets. With our final proposed model, we improved performance by 16% over baseline on VoxSRC2022 valudation set, achieving better results for our speaker verification system.

2021

pdf
Discussion on domain generalization in the cross-device speaker verification system
Wei-Ting Lin | Yu-Jia Zhang | Chia-Ping Chen | Chung-Li Lu | Bo-Cheng Chan
Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing (ROCLING 2021)

In this paper, we use domain generalization to improve the performance of the cross-device speaker verification system. Based on a trainable speaker verification system, we use domain generalization algorithms to fine-tune the model parameters. First, we use the VoxCeleb2 dataset to train ECAPA-TDNN as a baseline model. Then, use the CHT-TDSV dataset and the following domain generalization algorithms to fine-tune it: DANN, CDNN, Deep CORAL. Our proposed system tests 10 different scenarios in the NSYSU-TDSV dataset, including a single device and multiple devices. Finally, in the scenario of multiple devices, the best equal error rate decreased from 18.39 in the baseline to 8.84. Successfully achieved cross-device identification on the speaker verification system.

2020

pdf
Exploring Disparate Language Model Combination Strategies for Mandarin-English Code-Switching ASR
Wei-Ting Lin | Berlin Chen
Proceedings of the 32nd Conference on Computational Linguistics and Speech Processing (ROCLING 2020)