A Hyperparameter Optimization Toolkit for Neural Machine Translation Research

Xuan Zhang, Kevin Duh, Paul McNamee


Abstract
Hyperparameter optimization is an important but often overlooked process in the research of deep learning technologies. To obtain a good model, one must carefully tune hyperparameters that determine the architecture and training algorithm. Insufficient tuning may result in poor results, while inequitable tuning may lead to exaggerated differences between models. We present a hyperparameter optimization toolkit for neural machine translation (NMT) to help researchers focus their time on the creative rather than the mundane. The toolkit is implemented as a wrapper on top of the open-source Sockeye NMT software. Using the Asynchronous Successive Halving Algorithm (ASHA), we demonstrate that it is possible to discover near-optimal models under a computational budget with little effort.
Code: https://github.com/kevinduh/sockeye-recipes3
Video demo: https://cs.jhu.edu/~kevinduh/j/demo.mp4
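To illustrate the idea behind ASHA, below is a minimal Python sketch of synchronous successive halving, the promotion rule at ASHA's core: evaluate many configurations at a small budget, keep only the top 1/eta at each rung, and re-evaluate the survivors at eta times the budget. This is a generic illustration, not the toolkit's actual interface; train_and_evaluate and the hyperparameter names are hypothetical placeholders.

import random

def train_and_evaluate(config, budget):
    # Hypothetical placeholder: in practice this would launch an NMT
    # training run for `budget` units (e.g. checkpoints) and return a
    # dev-set metric such as BLEU.
    return random.random()

def successive_halving(configs, min_budget=1, eta=3, num_rungs=3):
    # Evaluate all configs at a small budget, promote the top 1/eta
    # to the next rung with eta times the budget, and repeat.
    survivors = list(configs)
    budget = min_budget
    for rung in range(num_rungs):
        scores = sorted(((train_and_evaluate(c, budget), c) for c in survivors),
                        key=lambda t: t[0], reverse=True)
        keep = max(1, len(survivors) // eta)
        survivors = [c for _, c in scores[:keep]]
        budget *= eta
    return survivors

# Example: 27 random configurations over two hypothetical hyperparameters.
configs = [{"lr": random.choice([1e-4, 3e-4, 1e-3]),
            "layers": random.choice([4, 6, 8])}
           for _ in range(27)]
print(successive_halving(configs))

ASHA itself applies the same promotion rule asynchronously, so that parallel workers promote configurations as soon as they qualify rather than waiting for a rung to fill up.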
Anthology ID:
2023.acl-demo.15
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Month:
July
Year:
2023
Address:
Toronto, Canada
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
161–168
URL:
https://aclanthology.org/2023.acl-demo.15
Cite (ACL):
Xuan Zhang, Kevin Duh, and Paul McNamee. 2023. A Hyperparameter Optimization Toolkit for Neural Machine Translation Research. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 161–168, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
A Hyperparameter Optimization Toolkit for Neural Machine Translation Research (Zhang et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-demo.15.pdf