The University of Helsinki submissions to the IWSLT 2018 low-resource translation task

Yves Scherrer


Abstract
This paper presents the University of Helsinki submissions to the Basque–English low-resource translation task. Our primary system is a standard bilingual Transformer system, trained on the available parallel data and various types of synthetic data. We describe in detail the creation of the synthetic datasets, some of which use a pivoting approach. One of our contrastive submissions is a multilingual model trained on comparable data, but without the synthesized parts. Our bilingual model with synthetic data performed best, obtaining 25.25 BLEU on the test data.
Anthology ID:
2018.iwslt-1.12
Volume:
Proceedings of the 15th International Conference on Spoken Language Translation
Month:
October 29-30
Year:
2018
Address:
Brussels
Editors:
Marco Turchi, Jan Niehues, Marcello Federico
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
International Conference on Spoken Language Translation
Pages:
82–88
URL:
https://aclanthology.org/2018.iwslt-1.12
Cite (ACL):
Yves Scherrer. 2018. The University of Helsinki submissions to the IWSLT 2018 low-resource translation task. In Proceedings of the 15th International Conference on Spoken Language Translation, pages 82–88, Brussels. International Conference on Spoken Language Translation.
Cite (Informal):
The University of Helsinki submissions to the IWSLT 2018 low-resource translation task (Scherrer, IWSLT 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2018.iwslt-1.12.pdf
Data
OpenSubtitles