2022
eTranslation’s Submissions to the WMT22 General Machine Translation Task
Csaba Oravecz | Katina Bontcheva | David Kolovratník | Bogomil Kovachev | Christopher Scott
Proceedings of the Seventh Conference on Machine Translation (WMT)
The paper describes the NMT models for French-German, English-Ukrainian and English-Russian submitted by the eTranslation team to the WMT22 general machine translation shared task. In the WMT news task last year, multilingual systems with deep and complex architectures utilizing immense amounts of data and resources were dominant. This year, with the task extended to cover less domain-specific text, we expected even more dominance of such systems. In the hope of producing competitive (constrained) systems despite our limited resources, this time we selected only medium-resource language pairs, which are serviced in the European Commission’s eTranslation system. We took the approach of exploring less resource-intensive strategies focusing on data selection and filtering to improve the performance of baseline systems. Our submitted systems scored competitively according to the automatic rankings, except for the English-Russian model, where our submission was only a baseline reference model developed as a by-product of the multilingual setup we built focusing primarily on the English-Ukrainian language pair.
2021
eTranslation’s Submissions to the WMT 2021 News Translation Task
Csaba Oravecz | Katina Bontcheva | David Kolovratník | Bhavani Bhaskar | Michael Jellinghaus | Andreas Eisele
Proceedings of the Sixth Conference on Machine Translation
The paper describes the three NMT models submitted by the eTranslation team to the WMT 2021 news translation shared task. We developed systems in language pairs that are actively used in the European Commission’s eTranslation service. In the WMT news task, recent years have seen a steady increase in the need for computational resources to train deep and complex architectures to produce competitive systems. We took a different approach and explored alternative strategies focusing on data selection and filtering to improve the performance of baseline systems. In the domain-constrained task for the French–German language pair, our approach resulted in the best system by a significant margin in BLEU. For the other two systems (English–German and English–Czech) we tried to build competitive models using standard best practices.
2020
eTranslation’s Submissions to the WMT 2020 News Translation Task
Csaba Oravecz | Katina Bontcheva | László Tihanyi | David Kolovratnik | Bhavani Bhaskar | Adrien Lardilleux | Szymon Klocek | Andreas Eisele
Proceedings of the Fifth Conference on Machine Translation
The paper describes the submissions of the eTranslation team to the WMT 2020 news translation shared task. Leveraging the experience from the team’s participation last year, we developed systems for 5 language pairs with various strategies. Compared to last year, for some language pairs we dedicated a lot more resources to training, and tried to follow standard best practices to build competitive systems which can achieve good results in the rankings. By using deep and complex architectures we sacrificed direct re-usability of our systems in production environments, but evaluation showed that this approach could result in better models that significantly outperform baseline architectures. We also submitted two systems to the zero-shot robustness task. These submissions are described briefly in this paper as well.
2019
eTranslation’s Submissions to the WMT 2019 News Translation Task
Csaba Oravecz | Katina Bontcheva | Adrien Lardilleux | László Tihanyi | Andreas Eisele
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
This paper describes the submissions of the eTranslation team to the WMT 2019 news translation shared task. The systems have been developed with the aim of identifying and following rather than establishing best practices, under the constraints imposed by a low-resource training and decoding environment normally used for our production systems. Thus, most of the findings and results are transferable to systems used in the eTranslation service. Evaluations suggest that this approach is able to produce decent models with good performance and speed without the overhead of using prohibitively deep and complex architectures.
2013
Syncretism and How to Deal with it in a Morphological Analyzer: a German Example
Katina Bontcheva
Proceedings of the 11th International Conference on Finite State Methods and Natural Language Processing
2012
Integrating Aspectually Relevant Properties of Verbs into a Morphological Analyzer for English
Katina Bontcheva
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing
2011
FTrace: A Tool for Finite-State Morphology
James Kilbury | Katina Bontcheva | Younes Samih
Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing