Tat-Thang Vu

Also published as: Tat Thang Vu


2016

The IOIT English ASR system for IWSLT 2016
Van Huy Nguyen | Trung-Nghia Phung | Tat Thang Vu | Chi Mai Luong
Proceedings of the 13th International Conference on Spoken Language Translation

This paper describes the speech recognition system of IOIT for IWSLT 2016. Four single DNN-based systems were developed to produce the 1st-pass lattices for the test sets using a baseline language model. The 2nd-pass lattices were then obtained by applying N-best list rescoring with topic-adapted language models, which were constructed from closed topic sentences by applying a text selection method. The final transcriptions of the test sets were produced by combining the rescored results. On the 2013 evaluation set, we were able to reduce the word error rate by 1.62% absolute. On the 2014 set, provided as a development set, the word error rate of our transcription is 11.3%.

2015

The IOIT English ASR system for IWSLT 2015
Van Huy Nguyen | Quoc Bao Nguyen | Tat Thang Vu | Chi Mai Luong
Proceedings of the 12th International Workshop on Spoken Language Translation: Evaluation Campaign

2014

The speech recognition systems of IOIT for IWSLT 2014
Quoc Bao Nguyen | Tat Thang Vu | Chi Mai Luong
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the speech recognition systems of IOIT for the IWSLT 2014 TED ASR track. This year, we focus on improving the acoustic models of the systems using two main deep neural network approaches: hybrid and bottleneck feature systems. These two subsystems are combined using lattice Minimum Bayes-Risk decoding. On the 2013 evaluation set, which serves as a progress test set, we were able to reduce the word error rate of our transcription systems from 27.2% to 24.0%, a relative reduction of 11.7%.

2013

The speech recognition and machine translation system of IOIT for IWSLT 2013
Ngoc-Quan Pham | Hai-Son Le | Tat-Thang Vu | Chi-Mai Luong
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the Automatic Speech Recognition (ASR) and Machine Translation (MT) systems developed by IOIT for the evaluation campaign of IWSLT 2013. For the ASR task, we used the Kaldi toolkit to develop a system based on weighted finite-state transducers. The system is constructed by applying several techniques, notably subspace Gaussian mixture models, speaker adaptation, discriminative training, system combination, and SOUL, a neural network language model. The techniques used for automatic segmentation are also described. In addition, we compared different types of SOUL models to study the impact of words from previous sentences on word prediction in language modeling. For the MT task, the baseline system was built with the open-source toolkit N-code and then augmented with SOUL on top, i.e., in the N-best rescoring phase.