Jun Yao


2021

pdf
HW-TSC’s Participation in the WMT 2021 Efficiency Shared Task
Hengchao Shang | Ting Hu | Daimeng Wei | Zongyao Li | Jianfei Feng | ZhengZhe Yu | Jiaxin Guo | Shaojun Li | Lizhi Lei | ShiMin Tao | Hao Yang | Jun Yao | Ying Qin
Proceedings of the Sixth Conference on Machine Translation

This paper presents the submission of Huawei Translation Services Center (HW-TSC) to WMT 2021 Efficiency Shared Task. We explore the sentence-level teacher-student distillation technique and train several small-size models that find a balance between efficiency and quality. Our models feature deep encoder, shallow decoder and light-weight RNN with SSRU layer. We use Huawei Noah’s Bolt, an efficient and light-weight library for on-device inference. Leveraging INT8 quantization, self-defined General Matrix Multiplication (GEMM) operator, shortlist, greedy search and caching, we submit four small-size and efficient translation models with high translation quality for the one CPU core latency track.