Abstract
This paper describes the University of Edinburgh’s neural machine translation systems submitted to the IWSLT 2020 open domain Japanese↔Chinese translation task. On top of commonplace techniques like tokenisation and corpus cleaning, we explore character mapping and unsupervised decoding-time adaptation. Our techniques focus on leveraging the provided data, and we show the positive impact of each technique through the gradual improvement of BLEU.
- Anthology ID:
- 2020.iwslt-1.14
- Volume:
- Proceedings of the 17th International Conference on Spoken Language Translation
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Marcello Federico, Alex Waibel, Kevin Knight, Satoshi Nakamura, Hermann Ney, Jan Niehues, Sebastian Stüker, Dekai Wu, Joseph Mariani, Francois Yvon
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 122–129
- URL:
- https://aclanthology.org/2020.iwslt-1.14
- DOI:
- 10.18653/v1/2020.iwslt-1.14
- Cite (ACL):
- Pinzhen Chen, Nikolay Bogoychev, and Ulrich Germann. 2020. Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 122–129, Online. Association for Computational Linguistics.
- Cite (Informal):
- Character Mapping and Ad-hoc Adaptation: Edinburgh’s IWSLT 2020 Open Domain Translation System (Chen et al., IWSLT 2020)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2020.iwslt-1.14.pdf
- Code:
- marian-nmt/marian