Abstract
In this report, we present our submission to the IWSLT 2020 Open Domain Translation Task. We built a data pre-processing pipeline to efficiently handle large, noisy web-crawled corpora, which boosts the BLEU score of a widely used transformer model on this translation task. To tackle the open-domain nature of the task, back-translation is applied to further improve translation performance.

- Anthology ID: 2020.iwslt-1.16
- Volume: Proceedings of the 17th International Conference on Spoken Language Translation
- Month: July
- Year: 2020
- Address: Online
- Venue: IWSLT
- SIG: SIGSLT
- Publisher: Association for Computational Linguistics
- Pages: 140–144
- URL: https://aclanthology.org/2020.iwslt-1.16
- DOI: 10.18653/v1/2020.iwslt-1.16
- Cite (ACL): Enmin Su and Yi Ren. 2020. Deep Blue Sonics’ Submission to IWSLT 2020 Open Domain Translation Task. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 140–144, Online. Association for Computational Linguistics.
- Cite (Informal): Deep Blue Sonics’ Submission to IWSLT 2020 Open Domain Translation Task (Su & Ren, IWSLT 2020)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2020.iwslt-1.16.pdf