The Kyoto Speech-to-Speech Translation System for IWSLT 2023

Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, Chenhui Chu


Abstract
This paper describes the Kyoto speech-to-speech translation system for IWSLT 2023. Our system is a combination of speech-to-text translation and text-to-speech synthesis. For the speech-to-text translation model, we used the dual-decoder Transformer model. For the text-to-speech synthesis model, we took a cascade approach of an acoustic model and a vocoder.
Anthology ID:
2023.iwslt-1.33
Volume:
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada (in-person and online)
Editors:
Elizabeth Salesky, Marcello Federico, Marine Carpuat
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
357–362
URL:
https://aclanthology.org/2023.iwslt-1.33
DOI:
10.18653/v1/2023.iwslt-1.33
Cite (ACL):
Zhengdong Yang, Shuichiro Shimizu, Wangjin Zhou, Sheng Li, and Chenhui Chu. 2023. The Kyoto Speech-to-Speech Translation System for IWSLT 2023. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 357–362, Toronto, Canada (in-person and online). Association for Computational Linguistics.
Cite (Informal):
The Kyoto Speech-to-Speech Translation System for IWSLT 2023 (Yang et al., IWSLT 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.iwslt-1.33.pdf