Abstract
End-to-end approaches have shown promising results for speech translation (ST), but they suffer from data scarcity compared to machine translation (MT). To address this, progressive training, which uses external MT data during the fine-tuning phase, has become a common practice. Despite its prevalence and computational overhead, its validity has not yet been extensively corroborated. This paper conducts an empirical investigation and finds that progressive training is ineffective. We identify the learning-forgetting trade-off as a critical obstacle, then hypothesize and verify that consistency learning (CL) breaks the dilemma of learning-forgetting. The proposed method, which combines knowledge distillation (KD) and CL, outperforms previous methods on the MuST-C dataset even without additional data, and our proposed consistency-informed KD achieves further improvements over KD+CL. Code and models are available at https://github.com/hjlee1371/consistency-s2tt.
- Anthology ID:
- 2023.findings-emnlp.905
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2023
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Houda Bouamor, Juan Pino, Kalika Bali
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 13572–13581
- URL:
- https://aclanthology.org/2023.findings-emnlp.905
- DOI:
- 10.18653/v1/2023.findings-emnlp.905
- Cite (ACL):
- Hojin Lee, Changmin Lee, and Seung-won Hwang. 2023. Consistency is Key: On Data-Efficient Modality Transfer in Speech Translation. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 13572–13581, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Consistency is Key: On Data-Efficient Modality Transfer in Speech Translation (Lee et al., Findings 2023)
- PDF:
- https://aclanthology.org/2023.findings-emnlp.905.pdf
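
The abstract describes combining knowledge distillation (KD) with consistency learning (CL). As a rough, illustrative sketch only (not the paper's actual formulation — the weighting scheme, the `combined_loss` helper, and the use of KL divergence for both terms are assumptions here), a training objective of this shape could look like:

```python
import math

def kl_div(p, q):
    """KL divergence D(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def combined_loss(ce_loss, student_probs, teacher_probs, probs_a, probs_b,
                  alpha=0.5, beta=0.5):
    """Hypothetical combined objective: cross-entropy on the ST task,
    plus a KD term matching an MT teacher's output distribution,
    plus a CL term encouraging agreement between two forward passes
    (e.g. under different perturbations) of the same input."""
    kd = kl_div(teacher_probs, student_probs)  # distill the MT teacher
    cl = kl_div(probs_a, probs_b)              # consistency between views
    return ce_loss + alpha * kd + beta * cl
```

When all distributions agree, both auxiliary terms vanish and the loss reduces to plain cross-entropy; the actual per-term weights and distance measures in the paper may differ.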