The 2016 KIT IWSLT Speech-to-Text Systems for English and German

Thai-Son Nguyen, Markus Müller, Matthias Sperber, Thomas Zenkel, Kevin Kilgour, Sebastian Stüker, Alex Waibel


Abstract
This paper describes our German and English Speech-to-Text (STT) systems for the 2016 IWSLT evaluation campaign. The campaign focuses on the transcription of unsegmented TED talks. Our setup includes systems using both the Janus and Kaldi frameworks. We combined the outputs using both ROVER [1] and confusion network combination (CNC) [2] to archieve a good overall performance. The individual subsystems are built by using different speaker-adaptive feature combination (e.g., lMEL with i-vector or bottleneck speaker vector), acoustic models (GMM or DNN) and speaker adaption (MLLR or fMLLR). Decoding is performed in two stages, where the GMM and DNN systems are adapted on the combination of the first stage outputs using MLLR, and fMLLR. The combination setup produces a final hypothesis that has a significantly lower WER than any of the individual subsystems. For the English TED task, our best combination system has a WER of 7.8% on the development set while our other combinations gained 21.8% and 28.7% WERs for the English and German MSLT tasks.
Anthology ID:
2016.iwslt-1.23
Volume:
Proceedings of the 13th International Conference on Spoken Language Translation
Month:
December 8-9
Year:
2016
Address:
Seattle, Washington D.C
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
International Workshop on Spoken Language Translation
Note:
Pages:
Language:
URL:
https://aclanthology.org/2016.iwslt-1.23
DOI:
Bibkey:
Cite (ACL):
Thai-Son Nguyen, Markus Müller, Matthias Sperber, Thomas Zenkel, Kevin Kilgour, Sebastian Stüker, and Alex Waibel. 2016. The 2016 KIT IWSLT Speech-to-Text Systems for English and German. In Proceedings of the 13th International Conference on Spoken Language Translation, Seattle, Washington D.C. International Workshop on Spoken Language Translation.
Cite (Informal):
The 2016 KIT IWSLT Speech-to-Text Systems for English and German (Nguyen et al., IWSLT 2016)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-script-update/2016.iwslt-1.23.pdf