Abstract
End-to-end simultaneous speech-to-text translation aims to translate streaming source speech directly into target text with high translation quality and low latency. A typical simultaneous translation (ST) system consists of a speech translation model and a policy module, which determines when to wait and when to translate. The policy is therefore crucial for balancing translation quality and latency. Conventional methods usually adopt fixed policies, e.g., segmenting the source speech into fixed-length chunks and translating each chunk. However, such policies ignore contextual information and suffer from low translation quality. This paper proposes an adaptive segmentation policy for end-to-end ST. Inspired by human interpreters, the policy learns to segment the streaming source speech into meaningful units by considering both acoustic features and translation history, maintaining consistency between the segmentation and the translation. Experimental results on English-German and Chinese-English show that our method achieves a better accuracy-latency trade-off than recently proposed state-of-the-art methods.
- Anthology ID:
- 2022.acl-long.542
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7862–7874
- URL:
- https://aclanthology.org/2022.acl-long.542
- DOI:
- 10.18653/v1/2022.acl-long.542
- Cite (ACL):
- Ruiqing Zhang, Zhongjun He, Hua Wu, and Haifeng Wang. 2022. Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7862–7874, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation (Zhang et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.acl-long.542.pdf
- Data:
- BSTC, MuST-C