Abstract
This paper presents the University of Helsinki submissions to the Basque–English low-resource translation task. Our primary system is a standard bilingual Transformer system, trained on the available parallel data and various types of synthetic data. We describe the creation of the synthetic datasets, some of which use a pivoting approach, in detail. One of our contrastive submissions is a multilingual model trained on comparable data, but without the synthesized parts. Our bilingual model with synthetic data performed best, obtaining 25.25 BLEU on the test data.
- Anthology ID:
- 2018.iwslt-1.12
- Volume:
- Proceedings of the 15th International Conference on Spoken Language Translation
- Month:
- October 29-30
- Year:
- 2018
- Address:
- Brussels
- Editors:
- Marco Turchi, Jan Niehues, Marcello Federico
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- International Conference on Spoken Language Translation
- Pages:
- 82–88
- URL:
- https://aclanthology.org/2018.iwslt-1.12
- Cite (ACL):
- Yves Scherrer. 2018. The University of Helsinki submissions to the IWSLT 2018 low-resource translation task. In Proceedings of the 15th International Conference on Spoken Language Translation, pages 82–88, Brussels. International Conference on Spoken Language Translation.
- Cite (Informal):
- The University of Helsinki submissions to the IWSLT 2018 low-resource translation task (Scherrer, IWSLT 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2018.iwslt-1.12.pdf
- Data
- OpenSubtitles