Abstract
Hyperparameter optimization is an important but often overlooked process in deep learning research. To obtain a good model, one must carefully tune hyperparameters that determine the architecture and training algorithm. Insufficient tuning may produce poor results, while inequitable tuning may lead to exaggerated differences between models. We present a hyperparameter optimization toolkit for neural machine translation (NMT) to help researchers focus their time on the creative rather than the mundane. The toolkit is implemented as a wrapper on top of the open-source Sockeye NMT software. Using the Asynchronous Successive Halving Algorithm (ASHA), we demonstrate that it is possible to discover near-optimal models under a computational budget with little effort.

Code: https://github.com/kevinduh/sockeye-recipes3
Video demo: https://cs.jhu.edu/~kevinduh/j/demo.mp4
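The abstract names ASHA as the search strategy. As a rough illustration of how successive halving allocates a fixed budget across configurations, here is a minimal Python sketch of ASHA's promotion rule (Li et al., 2020); the search space, the `train_and_eval` stub, and all constants are hypothetical placeholders, not the toolkit's actual interface.

```python
import random

# Minimal sketch of ASHA's promotion rule, assuming a toy search space --
# NOT the sockeye-recipes3 implementation. sample_config() and
# train_and_eval() below are hypothetical placeholders.

ETA = 4          # reduction factor: promote the top 1/ETA at each rung
MIN_EPOCHS = 1   # training budget at rung 0
MAX_RUNG = 3     # rung k trains for MIN_EPOCHS * ETA**k epochs

def sample_config():
    """Draw a random NMT-flavored configuration (placeholder space)."""
    return {
        "num_layers": random.choice([2, 4, 6]),
        "model_size": random.choice([256, 512, 1024]),
        "learning_rate": 10 ** random.uniform(-4, -3),
    }

def train_and_eval(config, epochs):
    """Stub: train `config` for `epochs` and return a dev score
    (higher is better). A real run would return BLEU or chrF."""
    return random.random()

rungs = [[] for _ in range(MAX_RUNG + 1)]  # (score, config) pairs per rung

def get_job():
    """Return (config, rung): promote the best not-yet-promoted config
    in the top 1/ETA of some rung, else sample a fresh config at rung 0."""
    for k in reversed(range(MAX_RUNG)):
        finished = sorted(rungs[k], key=lambda t: t[0], reverse=True)
        top = finished[: len(finished) // ETA]
        already_promoted = {id(c) for _, c in rungs[k + 1]}
        for _, cfg in top:
            if id(cfg) not in already_promoted:
                return cfg, k + 1
    return sample_config(), 0

# Sequential stand-in for ASHA's asynchronous workers: each iteration
# asks for a job, runs it, and reports the score back to its rung.
for _ in range(50):
    cfg, rung = get_job()
    score = train_and_eval(cfg, MIN_EPOCHS * ETA ** rung)
    rungs[rung].append((score, cfg))

best_score, best_cfg = max(rungs[MAX_RUNG] or rungs[0], key=lambda t: t[0])
print(f"best score {best_score:.3f} with config {best_cfg}")
```

In the actual toolkit, the training call would launch a Sockeye run and the worker loop would dispatch jobs across GPUs asynchronously; the promotion rule above is the part of ASHA that carries over.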
- Anthology ID: 2023.acl-demo.15
- Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
- Month: July
- Year: 2023
- Address: Toronto, Canada
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 161–168
- URL: https://aclanthology.org/2023.acl-demo.15
- Cite (ACL): Xuan Zhang, Kevin Duh, and Paul McNamee. 2023. A Hyperparameter Optimization Toolkit for Neural Machine Translation Research. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 161–168, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal): A Hyperparameter Optimization Toolkit for Neural Machine Translation Research (Zhang et al., ACL 2023)
- PDF: https://aclanthology.org/2023.acl-demo.15.pdf