Abstract
This paper presents experiments comparing character-based and byte-based neural machine translation systems. The main motivation for the byte-based system is to build multilingual neural machine translation systems that share a single vocabulary. We compare both systems on several language pairs and find that test performance is similar for most language pairs, while training time is slightly reduced for the byte-based system.
- Anthology ID:
- W17-4123
- Volume:
- Proceedings of the First Workshop on Subword and Character Level Models in NLP
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Venue:
- SCLeM
- Publisher:
- Association for Computational Linguistics
- Pages:
- 154–158
- URL:
- https://aclanthology.org/W17-4123
- DOI:
- 10.18653/v1/W17-4123
- Cite (ACL):
- Marta R. Costa-jussà, Carlos Escolano, and José A. R. Fonollosa. 2017. Byte-based Neural Machine Translation. In Proceedings of the First Workshop on Subword and Character Level Models in NLP, pages 154–158, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Byte-based Neural Machine Translation (Costa-jussà et al., SCLeM 2017)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/W17-4123.pdf
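
To illustrate the shared-vocabulary motivation stated in the abstract, the following is a minimal sketch of how character-level and byte-level tokenization differ. It assumes UTF-8 as the byte encoding, which the abstract does not specify; the helper names (`char_tokens`, `byte_tokens`, `bytes_to_text`) and the sample sentences are hypothetical and are not taken from the paper.

```python
# Minimal sketch: character-level vs. byte-level tokenization.
# Assumption: text is encoded as UTF-8 (not stated in the abstract).
# All function names and sample strings are illustrative only.

def char_tokens(text: str) -> list[str]:
    """Split text into Unicode characters (one token per code point)."""
    return list(text)

def byte_tokens(text: str) -> list[int]:
    """Split text into UTF-8 byte values (each token is in 0..255)."""
    return list(text.encode("utf-8"))

def bytes_to_text(tokens: list[int]) -> str:
    """Invert byte_tokens: rebuild the string from its byte values."""
    return bytes(tokens).decode("utf-8")

# Hypothetical sample sentences in three languages.
samples = {
    "en": "machine translation",
    "es": "traducción automática",
    "ru": "машинный перевод",
}

char_vocab: set[str] = set()
byte_vocab: set[int] = set()
for sentence in samples.values():
    char_vocab.update(char_tokens(sentence))
    byte_vocab.update(byte_tokens(sentence))

# The character vocabulary grows with every new script, while the
# byte vocabulary can never exceed 256 symbols for any language.
print("character vocabulary size:", len(char_vocab))
print("byte vocabulary size:", len(byte_vocab))
```

Under this assumption, every language maps onto the same fixed inventory of at most 256 symbols, which is one way a vocabulary can be shared across language pairs. The trade-off is longer sequences for non-ASCII scripts, since each accented Latin or Cyrillic character occupies two bytes in UTF-8.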