Alexander Allauzen


2014

pdf
Rule-based Reordering Space in Statistical Machine Translation
Nicolas Pécheux | Alexander Allauzen | François Yvon
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In Statistical Machine Translation (SMT), the constraints on word reorderings have a great impact on the set of potential translations that are explored. Notwithstanding computationnal issues, the reordering space of a SMT system needs to be designed with great care: if a larger search space is likely to yield better translations, it may also lead to more decoding errors, because of the added ambiguity and the interaction with the pruning strategy. In this paper, we study this trade-off using a state-of-the art translation system, where all reorderings are represented in a word lattice prior to decoding. This allows us to directly explore and compare different reordering spaces. We study in detail a rule-based preordering system, varying the length or number of rules, the tagset used, as well as contrasting with oracle settings and purely combinatorial subsets of permutations. We focus on two language pairs: English-French, a close language pair and English-German, known to be a more challenging reordering pair.

pdf
The KIT-LIMSI Translation System for WMT 2014
Quoc Khanh Do | Teresa Herrmann | Jan Niehues | Alexander Allauzen | François Yvon | Alex Waibel
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf
LIMSI @ WMT’14 Medical Translation Task
Nicolas Pécheux | Li Gong | Quoc Khanh Do | Benjamin Marie | Yulia Ivanishcheva | Alexander Allauzen | Thomas Lavergne | Jan Niehues | Aurélien Max | François Yvon
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf
LIMSI Submission for WMT’14 QE Task
Guillaume Wisniewski | Nicolas Pécheux | Alexander Allauzen | François Yvon
Proceedings of the Ninth Workshop on Statistical Machine Translation

pdf
Combining techniques from different NN-based language models for machine translation
Jan Niehues | Alexander Allauzen | François Yvon | Alex Waibel
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

This paper presents two improvements of language models based on Restricted Boltzmann Machine (RBM) for large machine translation tasks. In contrast to other continuous space approach, RBM based models can easily be integrated into the decoder and are able to directly learn a hidden representation of the n-gram. Previous work on RBM-based language models do not use a shared word representation and therefore, they might suffer of a lack of generalization for larger contexts. Moreover, since the training step is very time consuming, they are only used for quite small copora. In this work we add a shared word representation for the RBM-based language model by factorizing the weight matrix. In addition, we propose an efficient and tailored sampling algorithm that allows us to drastically speed up the training process. Experiments are carried out on two German to English translation tasks and the results show that the training time could be reduced by a factor of 10 without any drop in performance. Furthermore, the RBM-based model can also be trained on large size corpora.

2013

pdf
LIMSI @ WMT13
Alexander Allauzen | Nicolas Pécheux | Quoc Khanh Do | Marco Dinarelli | Thomas Lavergne | Aurélien Max | Hai-Son Le | François Yvon
Proceedings of the Eighth Workshop on Statistical Machine Translation

pdf
Joint WMT 2013 Submission of the QUAERO Project
Stephan Peitz | Saab Mansour | Matthias Huck | Markus Freitag | Hermann Ney | Eunah Cho | Teresa Herrmann | Mohammed Mediani | Jan Niehues | Alex Waibel | Alexander Allauzen | Quoc Khanh Do | Bianka Buschbeck | Tonio Wandmacher
Proceedings of the Eighth Workshop on Statistical Machine Translation