International Conference on Spoken Language Translation (2007)


bib (full) Proceedings of the Fourth International Workshop on Spoken Language Translation

pdf bib
Overview of the IWSLT 2007 evaluation campaign
Cameron Shaw Fordyce

In this paper we give an overview of the 2007 evaluation campaign for the International Workshop on Spoken Language Translation (IWSLT)1. As with previous evaluation campaigns, the primary focus of the workshop was the translation of spoken language in the travel domain. This year there were four language pairs; the translation of Chinese, Italian, Arabic, and Japanese into English. The input data consisted of the output of ASR systems for read speech and clean text. The exceptions were the challenge task of the Italian English language pair which used spontaneous speech ASR outputs and transcriptions and the Chinese English task which used only clean text. A new characteristic of this year’s evaluation campaign was an increased focus on the sharing of resources. Participants were requested to submit the data and supplementary resources used in building their systems so that the other participants might be able to take advantage of the same resources. A second new characteristic this year was the focus on the human evaluation of systems. Each primary run was judged in the human evaluation for every task using a straightforward ranking of systems. This year's workshop saw an increased participation over last year's workshop. This year 24 groups submitted runs to one or more of the tasks, compared to the 19 groups that submitted runs last year [1]. Automatic and human evaluation were carried out to measure MT performance under each condition, ASR system outputs for read speech, spontaneous travel dialogues, and clean text.

pdf bib
A comparison of linguistically and statistically enhanced models for speech-to-speech machine translation
Alicia Pérez | Víctor Guijarrubia | Raquel Justo | M. Inés Torres | Francisco Casacuberta

The goal of this work is to improve current translation models by taking into account additional knowledge sources such as semantically motivated segmentation or statistical categorization. Specifically, two different approaches are discussed. On the one hand, phrase-based approach, and on the other hand, categorization. For both approaches, both statistical and linguistic alternatives are explored. As for translation framework, finite-state transducers are considered. These are versatile models that can be easily integrated on-the-fly with acoustic models for speech translation purposes. In what the experimental framework concerns, all the models presented were evaluated and compared taking confidence intervals into account.

pdf bib
Improved chunk-level reordering for statistical machine translation
Yuqi Zhang | Richard Zens | Hermann Ney

Inspired by previous chunk-level reordering approaches to statistical machine translation, this paper presents two methods to improve the reordering at the chunk level. By introducing a new lattice weighting factor and by reordering the training source data, an improvement is reported on TER and BLEU. Compared to the previous chunklevel reordering approach, the BLEU score improves 1.4% absolutely. The translation results are reported on IWSLT Chinese-English task.

The CMU TransTac 2007 eyes-free two-way speech-to-speech translation system
Nguyen Bach | Matthais Eck | Paisarn Charoenpornsawat | Thilo Köhler | Sebastian Stüker | ThuyLinh Nguyen | Roger Hsiao | Alex Waibel | Stephan Vogel | Tanja Schultz | Alan W. Black

The paper describes our portable two-way speech-to-speech translation system using a completely eyes-free/hands-free user interface. This system translates between the language pair English and Iraqi Arabic as well as between English and Farsi, and was built within the framework of the DARPA TransTac program. The Farsi language support was developed within a 90-day period, testing our ability to rapidly support new languages. The paper gives an overview of the system’s components along with the individual component objective measures and a discussion of issues relevant for the overall usage of the system. We found that usability, flexibility, and robustness serve as severe constraints on system architecture and design.

CASIA phrase-based SMT system for IWSLT’07
Yu Zhou | Yanqing He | Chengqing Zong

The University of Edinburgh system description for IWSLT 2007
Josh Schroeder | Philipp Koehn

We present the University of Edinburgh’s submission for the IWSLT 2007 shared task. Our efforts focused on adapting our statistical machine translation system to the open data conditions for the Italian-English task of the evaluation campaign. We examine the challenges of building a system with a limited set of in-domain development data (SITAL), a small training corpus in a related but distinct domain (BTEC), and a large out of domain corpus (Europarl). We concentrated on the corrected text track, and present additional results of our experiments using the open-source Moses MT system with speech input.

The GREYC machine translation system for the IWSLT 2007 evaluation campaign
Yves Lepage | Adrien Lardilleux

The GREYC machine translation (MT) system is a slight evolution of the ALEPH machine translation system that participated in the IWLST 2005 campaign. It is a pure example-based MT system that exploits proportional analogies. The training data used for this campaign were limited on purpose to the sole data provided by the organizers. However, the training data were expanded with the results of sub-sentential alignments. Thesystemparticipatedinthetwoclassicaltasks of translation of manually transcribed texts from Japanese to English and Arabic to English.

I2R Chinese-English translation system for IWSLT 2007
Boxing Chen | Jun Sun | Hongfei Jiang | Min Zhang | Ai Ti Aw

In this paper, we describe the system and approach used by Institute for Infocomm Research (I2R) for the IWSLT 2007 spoken language evaluation campaign. A multi-pass approach is exploited to generate and select best translation. First, we use two decoders namely the open source Moses and an in-home syntax-based decoder to generate N-best lists. Next we spawn new translation entries through a word-based n-gram language model estimated on the former N-best entries. Finally, we join the N-best lists from the previous two passes, and select the best translation by rescoring them with additional feature functions. In particular, this paper reports our effort on new translation entry generation and system combination. The performance on development and test sets are reported. The system was ranked first with respect to the BLEU measure in Chinese-to-English open data track.

The CMU-UKA statistical machine translation systems for IWSLT 2007
Ian Lane | Andreas Zollmann | Thuy Linh Nguyen | Nguyen Bach | Ashish Venugopal | Stephan Vogel | Kay Rottmann | Ying Zhang | Alex Waibel

This paper describes the CMU-UKA statistical machine translation systems submitted to the IWSLT 2007 evaluation campaign. Systems were submitted for three language-pairs: Japanese→English, Chinese→English and Arabic→English. All systems were based on a common phrase-based SMT (statistical machine translation) framework but for each language-pair a specific research problem was tackled. For Japanese→English we focused on two problems: first, punctuation recovery, and second, how to incorporate topic-knowledge into the translation framework. Our Chinese→English submission focused on syntax-augmented SMT and for the Arabic→English task we focused on incorporating morphological-decomposition into the SMT framework. This research strategy enabled us to evaluate a wide variety of approaches which proved effective for the language pairs they were evaluated on.

MaTrEx: the DCU machine translation system for IWSLT 2007
Hany Hassan | Yanjun Ma | Andy Way

In this paper, we give a description of the machine translation system developed at DCU that was used for our second participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2007). In this participation, we focus on some new methods to improve system quality. Specifically, we try our word packing technique for different language pairs, we smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks in order to solve the high number of out of vocabulary items, and finally we deploy a translation-based model for case and punctuation restoration. We participated in both the classical and challenge tasks for the following translation directions: Chinese–English, Japanese–English and Arabic–English. For the last two tasks, we translated both the single-best ASR hypotheses and the correct recognition results; for Chinese–English, we just translated the correct recognition results. We report the results of the system for the provided evaluation sets, together with some additional experiments carried out following identification of some simple tokenisation errors in the official runs.

Nicola Bertoldi | Mauro Cettolo | Roldano Cattoni | Marcello Federico

This paper reports on the participation of FBK (formerly ITC-irst) at the IWSLT 2007 Evaluation. FBK participated in three tasks, namely Chinese-to-English, Japanese-to-English, and Italian-to-English. With respect to last year, translation systems were developed with the Moses Toolkit and the IRSTLM library, both available as open source software. Moreover, several novel ideas were investigated: the use of confusion networks in input to manage ambiguity in punctuation, the estimation of an additional language model by means of the Google’s Web 1T 5-gram collection, the combination of true case and lower case language models, and finally the use of multiple phrase-tables. By working on top of a state-of-the art baseline, experiments showed that the above methods accounted for significant BLEU score improvements.

HKUST statistical machine translation experiments for IWSLT 2007
Yihai Shen | Chi-kiu Lo | Marine Carpuat | Dekai Wu

This paper describes the HKUST experiments in the IWSLT 2007 evaluation campaign on spoken language translation. Our primary objective was to compare the open-source phrase-based statistical machine translation toolkit Moses against Pharaoh. We focused on Chinese to English translation, but we also report results on the Arabic to English, Italian to English, and Japanese to English tasks.

The University of Washington machine translation system for the IWSLT 2007 competition
Katrin Kirchhoff | Mei Yang

This paper presents the University of Washington’s submission to the 2007 IWSLT benchmark evaluation. The UW system participated in two data tracks, Italian-to-English and Arabic-to-English. Our main focus was on incorporating out-of-domain data, which contributed to improvements for both language pairs in both the clean text and ASR output conditions. In addition, we compared supervised and semi-supervised preprocessing schemes for the Arabic-to-English task and found that the semi-supervised scheme performs competitively with the supervised algorithm while using a fraction of the run-time.

The MIT-LL/AFRL IWSLT-2007 MT system
Wade Shen | Brian Delaney | Tim Anderson | Ray Slyh

The MIT-LL/AFRL MT system implements a standard phrase-based, statistical translation model. It incorporates a number of extensions that improve performance for speech-based translation. During this evaluation our efforts focused on the rapid porting of our SMT system to a new language (Arabic) and novel approaches to translation from speech input. This paper discusses the architecture of the MIT-LL/AFRL MT system, improvements over our 2006 system, and experiments we ran during the IWSLT-2007 evaluation. Specifically, we focus on 1) experiments comparing the performance of confusion network decoding and direct lattice decoding techniques for machine translation of speech, 2) the application of lightweight morphology for Arabic MT preprocessing and 3) improved confusion network decoding.

The NICT/ATR speech translation system for IWSLT 2007
Andrew Finch | Etienne Denoual | Hideo Okuma | Michael Paul | Hirofumi Yamamoto | Keiji Yasuda | Ruiqiang Zhang | Eiichiro Sumita

This paper describes the NiCT-ATR statistical machine translation (SMT) system used for the IWSLT 2007 evaluation campaign. We participated in three of the four language pair translation tasks (CE, JE, and IE). We used a phrase-based SMT system using log-linear feature models for all tracks. This year we decoded from the ASR n-best lists in the JE track and found a gain in performance. We also applied some new techniques to facilitate the use of out-of-domain external resources by model combination and also by utilizing a huge corpus of n-grams provided by Google Inc.. Using these resources gave mixed results that depended on the technique also the language pair however, in some cases we achieved consistently positive results. The results from model-interpolation in particular were very promising.

Larger feature set approach for machine translation in IWSLT 2007
Taro Watanabe | Jun Suzuki | Katsuhito Sudoh | Hajime Tsukada | Hideki Isozaki

The NTT Statistical Machine Translation System employs a large number of feature functions. First, k-best translation candidates are generated by an efficient decoding method of hierarchical phrase-based translation. Second, the k-best translations are reranked. In both steps, sparse binary features — of the order of millions — are integrated during the search. This paper gives the details of the two steps and shows the results for the Evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007.

The ICT statistical machine translation systems for IWSLT 2007
Zhongjun He | Haitao Mi | Yang Liu | Deyi Xiong | Weihua Luo | Yun Huang | Zhixiang Ren | Yajuan Lu | Qun Liu

In this paper, we give an overview of the ICT statistical machine translation systems for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007. In this year’s evaluation, we participated in the Chinese-English transcript translation task, and developed three systems based on different techniques: a formally syntax-based system Bruin, an extended phrase-based system Confucius and a linguistically syntax-based system Lynx. We will describe the models of these three systems, and compare their performance in detail. We set Bruin as our primary system, which ranks 2 among the 15 primary results according to the official evaluation results.

João V. Graça | Diamantino Caseiro | Luísa Coheur

We present the machine translation system used by L2F from INESC-ID in the evaluation campaign of the International Workshop on Spoken Language Translation (2007), in the task of translating spontaneous conversations in the travel domain from Italian to English.

Using word posterior probabilities in lattice translation
Vicente Alabau | Alberto Sanchis | Francisco Casacuberta

In this paper we describe the statistical machine translation system developed at ITI/UPV, which aims especially at speech recognition and statistical machine translation integration, for the evaluation campaign of the International Workshop on Spoken Language Translation (2007). The system we have developed takes advantage of an improved word lattice representation that uses word posterior probabilities. These word posterior probabilities are then added as a feature to a log-linear model. This model includes a stochastic finite-state transducer which allows an easy lattice integration. Furthermore, it provides a statistical phrase-based reordering model that is able to perform local reorderings of the output. We have tested this model on the Italian-English corpus, for clean text, 1-best ASR and lattice ASR inputs. The results and conclusions of such experiments are reported at the end of this paper.

The LIG Arabic/English speech translation system at IWSLT07
Laurent Besacier | Amar Mahdhaoui | Viet-Bac Le

This paper is a description of the system presented by the LIG laboratory to the IWSLT07 speech translation evaluation. The LIG participated, for the first time this year, in the Arabic to English speech translation task. For translation, we used a conventional statistical phrase-based system developed using the moses open source decoder. Our baseline MT system is described and we discuss particularly the use of an additional bilingual dictionary which seems useful when few training data is available. The main contribution of this paper concerns the proposal of a lattice decomposition algorithm that allows transforming a word lattice into a sub word lattice compatible with our MT model that uses word segmentation on the Arabic part. The lattice is then transformed into a confusion network which can be directly decoded into moses. The results show that this method outperforms the conventional 1-best translation which consists in translating only the most probable ASR hypothesis. The best BLEU score, from ASR output obtained on IWSLT06 evaluation data is 0.2253. The results confirm the interest of full CN decoding for speech translation, compared to traditional ASR 1-best approach. Our primary system was ranked 7/14 for IWSLT07 AE ASR task with a BLEU score of 0.3804.

NUDT machine translation system for IWSLT2007
Wen-Han Chao | Zhou-Jun Li

In this paper, we describe our machine translation system which was used for the Chinese-to-English task in the IWSLT2007 evaluation campaign. The system is a statistical machine translation (SMT) system, while containing an example-based decoder. In this way, it will help to solve the re-ordering problem and other problems for spoken language MT, such as lots of omissions, idioms etc. We report the results of the system for the provided evaluation sets.

MISTRAL: a lattice translation system for IWSLT 2007
Alexandre Patry | Philippe Langlais | Frédéric Béchet

This paper describes MISTRAL, the lattice translation system that we developed for the Italian-English track of the International Workshop on Spoken Language Translation 2007. MISTRAL is a discriminative phrase-based system that translates a source word lattice in two passes. The first pass extracts a list of top ranked sentence pairs from the lattice and the second pass rescores this list with more complex features. Our experiments show that our system, when translating pruned lattices, is at least as good as a fair baseline that translates the first ranked sentences returned by a speech recognition system.

Statistical machine translation using large J/E parallel corpus and long phrase tables
Jin’ichi Murakami | Masato Tokuhisa | Satoru Ikehara

Our statistical machine translation system that uses large Japanese-English parallel sentences and long phrase tables is described. We collected 698,973 Japanese-English parallel sentences, and we used long phrase tables. Also, we utilized general tools for statistical machine translation, such as ”Giza++”[1], ”moses”[2], and ”training-phrasemodel.perl”[3]. We used these data and these tools, We challenge the contest for IWSLT07. In which task was the result (0.4321 BLEU) obtained.

The XMU SMT system for IWSLT 2007
Yidong Chen | Xiaodong Shi | Changle Zhou

In this paper, an overview of the XMU statistical machine translation (SMT) system for the 2007 IWSLT Speech Translation Evaluation is given. Our system is a phrase-based system with a reordering model based on chunking and reordering of source language. In this year’s evaluation, we participated in the open data track for Clean Transcripts for the Chinese-English translation direction. The system ranked the 12th among the 15 participating systems.

The RWTH machine translation system for IWSLT 2007
Arne Mauser | David Vilar | Gregor Leusch | Yuqi Zhang | Hermann Ney

The RWTH system for the IWSLT 2007 evaluation is a combination of several statistical machine translation systems. The combination includes Phrase-Based models, a n-gram translation model and a hierarchical phrase model. We describe the individual systems and the method that was used for combining the system outputs. Compared to our 2006 system, we newly introduce a hierarchical phrase-based translation model and show improvements in system combination for Machine Translation. RWTH participated in the Italian-to-English and Chinese-to-English translation directions.

The TALP ngram-based SMT system for IWSLT 2007
Patrik Lambert | Marta R. Costa-jussà | Josep M. Crego | Maxim Khalilov | José B. Mariño | Rafael E. Banchs | José A. R. Fonollosa | Holger Schwenk

This paper describes TALPtuples, the 2007 N-gram-based statistical machine translation system developed at the TALP Research Center of the UPC (Universitat Polite`cnica de Catalunya) in Barcelona. Emphasis is put on improvements and extensions of the system of previous years. Mainly, these include optimizing alignment parameters in function of translation metric scores and rescoring with a neural network language model. Results on two translation directions are reported, namely from Arabic and Chinese into English, thoroughly explaining all language-related preprocessing and translation schemes.

The TÜBÍTAK-UEKAE statistical machine translation system for IWSLT 2007
Coşkun Mermer | Hamza Kaya | Mehmet Uğur Doğan

We describe the TÜBITAK-UEKAE system that participated in the Arabic-to-English and Japanese-to-English translation tasks of the IWSLT 2007 evaluation campaign. Our system is built on the open-source phrase-based statistical machine translation software Moses. Among available corpora and linguistic resources, only the supplied training data and an Arabic morphological analyzer are used in the system. We present the run-time lexical approximation method to cope with out-of-vocabulary words during decoding. We tested our system under both automatic speech recognition (ASR) and clean transcript (clean) input conditions. Our system was ranked first in both Arabic-to-English and Japanese-to-English tasks under the “clean” condition.

The University of Maryland translation system for IWSLT 2007
Christopher J. Dyer

This paper describes the University of Maryland statistical machine translation system used in the IWSLT 2007 evaluation. Our focus was threefold: using hierarchical phrase-based models in spoken language translation, the incorporation of sub-lexical information in model estimation via morphological analysis (Arabic) and word and character segmentation (Chinese), and the use of n-gram sequence models for source-side punctuation prediction. Our efforts yield significant improvements in Chinese-English and Arabic-English translation tasks for both spoken language and human transcription conditions.