Abstract
We examine the effects of particular orderings of sentence pairs on the on-line training of neural machine translation (NMT). We focus on two types of such orderings: (1) ensuring that each minibatch contains sentences similar in some aspect and (2) gradual inclusion of some sentence types as the training progresses (so-called "curriculum learning"). In our English-to-Czech experiments, the internal homogeneity of minibatches has no effect on the training, but some of our "curricula" achieve a small improvement over the baseline.
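To make the two ordering strategies concrete, the sketch below illustrates them in Python. It is only an illustrative sketch, not the authors' implementation: the function names (`bucket_minibatches`, `curriculum_batches`) and the use of source-sentence length as the similarity/difficulty criterion are assumptions for the example; the paper itself explores several bucketing and curriculum criteria.

```python
import random

def bucket_minibatches(pairs, batch_size, key=lambda p: len(p[0].split())):
    """Bucketing: group sentence pairs so each minibatch is internally
    homogeneous in `key` (here: source length), then shuffle batch order."""
    ordered = sorted(pairs, key=key)
    batches = [ordered[i:i + batch_size]
               for i in range(0, len(ordered), batch_size)]
    random.shuffle(batches)  # batches stay homogeneous inside, random outside
    return batches

def curriculum_batches(pairs, batch_size, num_phases=3,
                       key=lambda p: len(p[0].split())):
    """Curriculum: gradually include 'harder' pairs (here: longer source
    sentences) as training progresses; phase k samples from the easiest
    k/num_phases fraction of the data."""
    ordered = sorted(pairs, key=key)
    for phase in range(1, num_phases + 1):
        available = ordered[: phase * len(ordered) // num_phases]
        random.shuffle(available)
        for i in range(0, len(available), batch_size):
            yield phase, available[i:i + batch_size]

if __name__ == "__main__":
    # Toy parallel corpus of (source, target) pairs of increasing length.
    corpus = [("src " + "x " * n, "tgt " + "y " * n) for n in range(1, 21)]
    for phase, batch in curriculum_batches(corpus, batch_size=4):
        print(phase, [len(src.split()) for src, _ in batch])
```

Printing the source lengths per batch shows that early phases see only short sentences, while later phases draw from the whole length range, matching the "gradual inclusion" idea described in the abstract.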
- Anthology ID: R17-1050
- Volume: Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
- Month: September
- Year: 2017
- Address: Varna, Bulgaria
- Editors: Ruslan Mitkov, Galia Angelova
- Venue: RANLP
- Publisher: INCOMA Ltd.
- Pages: 379–386
- URL: https://doi.org/10.26615/978-954-452-049-6_050
- DOI: 10.26615/978-954-452-049-6_050
- Cite (ACL): Tom Kocmi and Ondřej Bojar. 2017. Curriculum Learning and Minibatch Bucketing in Neural Machine Translation. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 379–386, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal): Curriculum Learning and Minibatch Bucketing in Neural Machine Translation (Kocmi & Bojar, RANLP 2017)