Fabien Cromieres

Also published as: Fabien Cromières


2020

pdf
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation
Zhuoyuan Mao | Fabien Cromieres | Raj Dabre | Haiyue Song | Sadao Kurohashi
Proceedings of the Twelfth Language Resources and Evaluation Conference

Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Sequence) are extremely effective in boosting NMT quality for languages with small parallel corpora. However, they do not account for linguistic information obtained using syntactic analyzers which is known to be invaluable for several Natural Language Processing (NLP) tasks. To this end, we propose JASS, Japanese-specific Sequence to Sequence, as a novel pre-training alternative to MASS for NMT involving Japanese as the source or target language. JASS is joint BMASS (Bunsetsu MASS) and BRSS (Bunsetsu Reordering Sequence to Sequence) pre-training which focuses on Japanese linguistic units called bunsetsus. In our experiments on ASPEC Japanese–English and News Commentary Japanese–Russian translation we show that JASS can give results that are competitive with if not better than those given by MASS. Furthermore, we show for the first time that joint MASS and JASS pre-training gives results that significantly surpass the individual methods indicating their complementary nature. We will release our code, pre-trained models and bunsetsu annotated data as resources for researchers to use in their own NLP tasks.

2019

pdf
Kyoto University Participation to the WMT 2019 News Shared Task
Fabien Cromieres | Sadao Kurohashi
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe here the experiments we did for the the news translation shared task of WMT 2019. We focused on the new German-to-French language direction, and mostly used current standard approaches to develop a Neural Machine Translation system. We make use of the Tensor2Tensor implementation of the Transformer model. After carefully cleaning the data and noting the importance of the good use of recent monolingual data for the task, we obtain our final result by combining the output of a diverse set of trained models through the use of their “checkpoint agreement”.

2017

pdf
Kyoto University Participation to WAT 2017
Fabien Cromieres | Raj Dabre | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

We describe here our approaches and results on the WAT 2017 shared translation tasks. Following our good results with Neural Machine Translation in the previous shared task, we continue this approach this year, with incremental improvements in models and training methods. We focused on the ASPEC dataset and could improve the state-of-the-art results for Chinese-to-Japanese and Japanese-to-Chinese translations.

pdf
Kyoto University MT System Description for IWSLT 2017
Raj Dabre | Fabien Cromieres | Sadao Kurohashi
Proceedings of the 14th International Conference on Spoken Language Translation

We describe here our Machine Translation (MT) model and the results we obtained for the IWSLT 2017 Multilingual Shared Task. Motivated by Zero Shot NMT [1] we trained a Multilingual Neural Machine Translation by combining all the training data into one single collection by appending the tokens to the source sentences in order to indicate the target language they should be translated to. We observed that even in a low resource situation we were able to get translations whose quality surpass the quality of those obtained by Phrase Based Statistical Machine Translation by several BLEU points. The most surprising result we obtained was in the zero shot setting for Dutch-German and Italian-Romanian where we observed that despite using no parallel corpora between these language pairs, the NMT model was able to translate between these languages and the translations were either as good as or better (in terms of BLEU) than the non zero resource setting. We also verify that the NMT models that use feed forward layers and self attention instead of recurrent layers are extremely fast in terms of training which is useful in a NMT experimental setting.

pdf
Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages
Raj Dabre | Fabien Cromieres | Sadao Kurohashi
Proceedings of Machine Translation Summit XVI: Research Track

pdf
Neural Machine Translation: Basics, Practical Aspects and Recent Trends
Fabien Cromieres | Toshiaki Nakazawa | Raj Dabre
Proceedings of the IJCNLP 2017, Tutorial Abstracts

Machine Translation (MT) is a sub-field of NLP which has experienced a number of paradigm shifts since its inception. Up until 2014, Phrase Based Statistical Machine Translation (PBSMT) approaches used to be the state of the art. In late 2014, Neural Machine Translation (NMT) was introduced and was proven to outperform all PBSMT approaches by a significant margin. Since then, the NMT approaches have undergone several transformations which have pushed the state of the art even further. This tutorial is primarily aimed at researchers who are either interested in or are fairly new to the world of NMT and want to obtain a deep understanding of NMT fundamentals. Because it will also cover the latest developments in NMT, it should also be useful to attendees with some experience in NMT.

2016

pdf bib
Cross-language Projection of Dependency Trees with Constrained Partial Parsing for Tree-to-Tree Machine Translation
Yu Shen | Chenhui Chu | Fabien Cromieres | Sadao Kurohashi
Proceedings of the First Conference on Machine Translation: Volume 1, Research Papers

pdf
The Kyoto University Cross-Lingual Pronoun Translation System
Raj Dabre | Yevgeniy Puzikov | Fabien Cromieres | Sadao Kurohashi
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

pdf
Kyoto University Participation to WAT 2016
Fabien Cromieres | Chenhui Chu | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

We describe here our approaches and results on the WAT 2016 shared translation tasks. We tried to use both an example-based machine translation (MT) system and a neural MT system. We report very good translation results, especially when using neural MT for Chinese-to-Japanese translation.

pdf bib
Flexible Non-Terminals for Dependency Tree-to-Tree Reordering
John Richardson | Fabien Cromières | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Kyoto-NMT: a Neural Machine Translation implementation in Chainer
Fabien Cromières
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We present Kyoto-NMT, an open-source implementation of the Neural Machine Translation paradigm. This implementation is done in Python and Chainer, an easy-to-use Deep Learning Framework.

2015

pdf
KyotoEBMT System Description for the 2nd Workshop on Asian Translation
John Richardson | Raj Dabre | Chenhui Chu | Fabien Cromières | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)

pdf
Large-scale Dictionary Construction via Pivot-based Statistical Machine Translation with Significance Pruning and Neural Network Features
Raj Dabre | Chenhui Chu | Fabien Cromieres | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

pdf
Cross-language Projection of Dependency Trees for Tree-to-tree Machine Translation
Yu Shen | Chenhui Chu | Fabien Cromieres | Sadao Kurohashi
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation: Posters

pdf
Leveraging Small Multilingual Corpora for SMT Using Many Pivot Languages
Raj Dabre | Fabien Cromieres | Sadao Kurohashi | Pushpak Bhattacharyya
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

pdf
Translation Rules with Right-Hand Side Lattices
Fabien Cromières | Sadao Kurohashi
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

pdf
KyotoEBMT: An Example-Based Dependency-to-Dependency Translation Framework
John Richardson | Fabien Cromières | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

pdf
KyotoEBMT System Description for the 1st Workshop on Asian Translation
John Richardson | Fabien Cromières | Toshiaki Nakazawa | Sadao Kurohashi
Proceedings of the 1st Workshop on Asian Translation (WAT2014)

2012

pdf
Constrained Hidden Markov Model for Bilingual Keyword Pairs Alignment
Denny Cahyadi | Fabien Cromieres | Sadao Kurohashi
Proceedings of the 10th Workshop on Asian Language Resources

2011

pdf
Efficient retrieval of tree translation examples for Syntax-Based Machine Translation
Fabien Cromieres | Sadao Kurohashi
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2009

pdf
An Alignment Algorithm Using Belief Propagation and a Structure-Based Distortion Model
Fabien Cromières | Sadao Kurohashi
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

2006

pdf
Sub-Sentential Alignment Using Substring Co-Occurrence Counts
Fabien Cromieres
Proceedings of the COLING/ACL 2006 Student Research Workshop