2018
CUNI Submissions in WMT18
Tom Kocmi | Roman Sudarikov | Ondřej Bojar
Proceedings of the Third Conference on Machine Translation: Shared Task Papers
We participated in the WMT 2018 shared news translation task in three language pairs: English-Estonian, English-Finnish, and English-Czech. Our main focus was the low-resource language pair of Estonian and English, for which we utilized Finnish parallel data with a simple method. We first train a “parent model” for the high-resource language pair and then adapt it to the related low-resource language pair. This approach brings a substantial performance boost over the baseline system trained only on Estonian-English parallel data. Our systems are based on the Transformer architecture. For English-to-Czech translation, we evaluated last year's hybrid phrase-based and neural machine translation models, mainly for comparison purposes.
2017
CUNI submission in WMT17: Chimera goes neural
Roman Sudarikov | David Mareček | Tom Kocmi | Dušan Variš | Ondřej Bojar
Proceedings of the Second Conference on Machine Translation
2016
UFAL Submissions to the IWSLT 2016 MT Track
Ondřej Bojar | Ondřej Cífka | Jindřich Helcl | Tom Kocmi | Roman Sudarikov
Proceedings of the 13th International Conference on Spoken Language Translation
We present our submissions to the IWSLT 2016 machine translation task, as our first attempt to translate subtitles and one of our early experiments with neural machine translation (NMT). We focus primarily on the English→Czech translation direction but also perform basic adaptation experiments for NMT with German and with the reverse direction. Three MT systems are tested: (1) our Chimera, a tight combination of phrase-based MT and deep linguistic processing, (2) Neural Monkey, our implementation of an NMT system in TensorFlow, and (3) Nematus, an established NMT system.
CUNI-LMU Submissions in WMT2016: Chimera Constrained and Beaten
Aleš Tamchyna | Roman Sudarikov | Ondřej Bojar | Alexander Fraser
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Dictionary-based Domain Adaptation of MT Systems without Retraining
Rudolf Rosa | Roman Sudarikov | Michal Novák | Martin Popel | Ondřej Bojar
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Verb sense disambiguation in Machine Translation
Roman Sudarikov | Ondřej Dušek | Martin Holub | Ondřej Bojar | Vincent Kríž
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)
We describe experiments in Machine Translation using word sense disambiguation (WSD) information. This work focuses on WSD in verbs, based on two different approaches: verbal patterns based on corpus pattern analysis and verbal word senses from valency frames. We evaluate several options for using verb senses in the source-language sentences as an additional factor for the Moses statistical machine translation system. Our results show a statistically significant translation quality improvement in terms of the BLEU metric for the valency frames approach; in manual evaluation, however, both WSD methods bring improvements.
TectoMT – a deep linguistic core of the combined Chimera MT system
Martin Popel | Roman Sudarikov | Ondřej Bojar | Rudolf Rosa | Jan Hajič
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products
2015
TeamUFAL: WSD+EL as Document Retrieval
Petr Fanta | Roman Sudarikov | Ondřej Bojar
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)