Rihards Krišlauks

Also published as: Rihards Krislauks


2021

The Neural Translation for the European Union (NTEU) engine farm enables direct machine translation for all 24 official languages of the European Union without the necessity to use a high-resourced language as a pivot. This amounts to a total of 552 translation engines for all combinations of the 24 languages. We have collected parallel data for all the language combinations publickly shared in elrc-share.eu. The translation engines have been customized to domain,for the use of the European public administrations. The delivered engines will be published in the European Language Grid. In addition to the usual automatic metrics, all the engines have been evaluated by humans based on the direct assessment methodology. For this purpose, we built an open-source platform called MTET The evaluation shows that most of the engines reach high quality and get better scores compared to an external machine translation service in a blind evaluation setup.

2020

This paper describes Tilde’s submission to the WMT2020 shared task on news translation for both directions of the English-Polish language pair in both the constrained and the unconstrained tracks. We follow our submissions form the previous years and build our baseline systems to be morphologically motivated sub-word unit-based Transformer base models that we train using the Marian machine translation toolkit. Additionally, we experiment with different parallel and monolingual data selection schemes, as well as sampled back-translation. Our final models are ensembles of Transformer base and Transformer big models which feature right-to-left re-ranking.

2019

The paper describes the development process of Tilde’s NMT systems for the WMT 2019 shared task on news translation. We trained systems for the English-Lithuanian and Lithuanian-English translation directions in constrained and unconstrained tracks. We build upon the best methods of the previous year’s competition and combine them with recent advancements in the field. We also present a new method to ensure source domain adherence in back-translated data. Our systems achieved a shared first place in human evaluation.

2018

This paper describes the submissions to the efficiency track for GPUs at the Workshop for Neural Machine Translation and Generation by members of the University of Edinburgh, Adam Mickiewicz University, Tilde and University of Alicante. We focus on efficient implementation of the recurrent deep-learning model as implemented in Amun, the fast inference engine for neural machine translation. We improve the performance with an efficient mini-batching algorithm, and by fusing the softmax operation with the k-best extraction algorithm. Submissions using Amun were first, second and third fastest in the GPU efficiency track.
The paper describes the development process of the Tilde’s NMT systems that were submitted for the WMT 2018 shared task on news translation. We describe the data filtering and pre-processing workflows, the NMT system training architectures, and automatic evaluation results. For the WMT 2018 shared task, we submitted seven systems (both constrained and unconstrained) for English-Estonian and Estonian-English translation directions. The submitted systems were trained using Transformer models.

2017