Fred Bane


2021

System Description for Transperfect
Wiktor Stribiżew | Fred Bane | José Conceição | Anna Zaretskaya
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

In this paper, we describe our participation in the 2021 Workshop on Asian Translation (team ID: tpt_wat). We submitted results for all six directions of the JPC2 patent task. As a first-time participant in the task, we attempted to identify a single configuration that provided the best overall results across all language pairs. All our submissions were created using single base transformer models, trained on only the task-specific data, using a consistent configuration of hyperparameters. In contrast to the uniformity of our methods, our results vary widely across the six language pairs.

Selecting the best data filtering method for NMT training
Fred Bane | Anna Zaretskaya
Proceedings of Machine Translation Summit XVIII: Users and Providers Track

Performance of NMT systems has been proven to depend on the quality of the training data. In this paper we explore different open-source tools that can be used to score the quality of translation pairs, with the goal of obtaining clean corpora for training NMT models. We measure the performance of these tools by correlating their scores with human scores, and we rank models trained on the resulting filtered datasets by their performance on different test sets and MT performance metrics.