CKDST: Comprehensively and Effectively Distill Knowledge from Machine Translation to End-to-End Speech Translation

Yikun Lei; Zhengshan Xue; Xiaohu Zhao; Haoran Sun; Shaolin Zhu; Xiaodong Lin; Deyi Xiong

doi:10.18653/v1/2023.findings-acl.195

Abstract

Distilling knowledge from a high-resource task, e.g., machine translation, is an effective way to alleviate the data scarcity problem of end-to-end speech translation. However, previous works simply use classical knowledge distillation, which does not allow for adequate transfer of knowledge from machine translation. In this paper, we propose a comprehensive knowledge distillation framework for speech translation, CKDST, which is capable of comprehensively and effectively distilling knowledge from machine translation to speech translation from two perspectives: cross-modal contrastive representation distillation and simultaneous decoupled knowledge distillation. In the former, we leverage a contrastive learning objective to optimize the mutual information between speech and text representations for representation distillation in the encoder. In the latter, we decouple the non-target class knowledge from target class knowledge for logits distillation in the decoder. Experiments on the MuST-C benchmark dataset demonstrate that our CKDST substantially improves the baseline by 1.2 BLEU on average in all translation directions, and outperforms previous state-of-the-art end-to-end and cascaded speech translation models.
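
The two distillation perspectives named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: it assumes an InfoNCE-style contrastive loss over paired speech and text vectors, and the commonly used split of the distillation KL divergence into a target-class term and a non-target-class term. The function names, the `alpha`, `beta`, and `temperature` parameters, and the toy inputs are all hypothetical; the abstract does not specify the exact losses, weights, or how representations are pooled.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cosine(a, b):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(speech_vecs, text_vecs, temperature=0.1):
    """Assumed contrastive objective: speech_vecs[i] and text_vecs[i] form
    the positive pair; every other text vector in the batch is a negative."""
    loss = 0.0
    for i, s in enumerate(speech_vecs):
        sims = [cosine(s, t) for t in text_vecs]
        probs = softmax(sims, temperature)
        loss -= math.log(probs[i])
    return loss / len(speech_vecs)


def decoupled_kd(student_logits, teacher_logits, target,
                 alpha=1.0, beta=1.0, temperature=1.0):
    """Assumed decoupled distillation for one decoder step.

    Returns (weighted_loss, target_term, non_target_term). The teacher
    stands in for the MT model, the student for the ST model.
    """
    ps = softmax(student_logits, temperature)
    pt = softmax(teacher_logits, temperature)
    # Target-class term: KL between the binary (target vs. rest) distributions.
    target_term = kl_divergence([pt[target], 1.0 - pt[target]],
                                [ps[target], 1.0 - ps[target]])
    # Non-target term: KL between distributions renormalized over
    # the non-target classes only.
    pt_rest = [p / (1.0 - pt[target]) for k, p in enumerate(pt) if k != target]
    ps_rest = [p / (1.0 - ps[target]) for k, p in enumerate(ps) if k != target]
    non_target_term = kl_divergence(pt_rest, ps_rest)
    return alpha * target_term + beta * non_target_term, target_term, non_target_term
```

In this sketch, classical distillation corresponds to the fixed combination `target_term + (1 - pt[target]) * non_target_term`, so the non-target term is down-weighted whenever the teacher is confident. Giving the two terms independent weights is one way to read "decoupling" the non-target class knowledge from the target class knowledge.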

Anthology ID: 2023.findings-acl.195
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3123–3137
URL: https://aclanthology.org/2023.findings-acl.195
DOI: 10.18653/v1/2023.findings-acl.195
Cite (ACL): Yikun Lei, Zhengshan Xue, Xiaohu Zhao, Haoran Sun, Shaolin Zhu, Xiaodong Lin, and Deyi Xiong. 2023. CKDST: Comprehensively and Effectively Distill Knowledge from Machine Translation to End-to-End Speech Translation. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3123–3137, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): CKDST: Comprehensively and Effectively Distill Knowledge from Machine Translation to End-to-End Speech Translation (Lei et al., Findings 2023)
PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.findings-acl.195.pdf
