Nicholas Ruiz


2017

pdf bib
Proceedings of the Workshop on Speech-Centric Natural Language Processing
Nicholas Ruiz | Srinivas Bangalore
Proceedings of the Workshop on Speech-Centric Natural Language Processing

2014

pdf bib
Assessing the impact of speech recognition errors on machine translation quality
Nicholas Ruiz | Marcello Federico
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

In spoken language translation, it is crucial that an automatic speech recognition (ASR) system produces outputs that can be adequately translated by a statistical machine translation (SMT) system. While word error rate (WER) is the standard metric of ASR quality, the assumption that each ASR error type is weighted equally is violated in a SMT system that relies on structured input. In this paper, we outline a statistical framework for analyzing the impact of specific ASR error types on translation quality in a speech translation pipeline. Our approach is based on linear mixed-effects models, which allow the analysis of ASR errors on a translation quality metric. The mixed-effects models take into account the variability of ASR systems and the difficulty of each speech utterance being translated in a specific experimental setting. We use mixed-effects models to verify that the ASR errors that compose the WER metric do not contribute equally to translation quality and that interactions exist between ASR errors that cumulatively affect a SMT system’s ability to translate an utterance. Our experiments are carried out on the English to French language pair using eight ASR systems and seven post-edited machine translation references from the IWSLT 2013 evaluation campaign. We report significant findings that demonstrate differences in the contributions of specific ASR error types toward speech translation quality and suggest further error types that may contribute to translation difficulty.

pdf bib
Complexity of spoken versus written language for machine translation
Nicholas Ruiz | Marcello Federico
Proceedings of the 17th Annual conference of the European Association for Machine Translation

2013

pdf bib
MSR-FBK IWSLT 2013 SLT system description
Anthony Aue | Qin Gao | Hany Hassan | Xiaodong He | Gang Li | Nicholas Ruiz | Frank Seide
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the systems used for the MSR+FBK submission for the SLT track of IWSLT 2013. Starting from a baseline system we made a series of iterative and additive improvements, including a novel method for processing bilingual data used to train MT systems for use on ASR output. Our primary submission is a system combination of five individual systems, combining the output of multiple ASR engines with multiple MT techniques. There are two contrastive submissions to help place the combined system in context. We describe the systems used and present results on the test sets.

pdf bib
FBK’s machine translation systems for the IWSLT 2013 evaluation campaign
Nicola Bertoldi | M. Amin Farajian | Prashant Mathur | Nicholas Ruiz | Marcello Federico
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign

This paper describes the systems submitted by FBK for the MT track of IWSLT 2013. We participated in the English-French as well as the bidirectional Persian-English translation tasks. We report substantial improvements in our English-French systems over last year’s baselines, largely due to improved techniques of combining translation and language models. For our Persian-English and English-Persian systems, we observe substantive improvements over baselines submitted by the workshop organizers, due to enhanced language-specific text normalization and the creation of a large monolingual news corpus in Persian.