Hanyi Zhang
2023
Submission of USTC’s System for the IWSLT 2023 - Offline Speech Translation Track
Xinyuan Zhou | Jianwei Cui | Zhongyi Ye | Yichi Wang | Luzhen Xu | Hanyi Zhang | Weitai Zhang | Lirong Dai
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
This paper describes the submissions of the research group USTC-NELSLIP to the 2023 IWSLT Offline Speech Translation competition, which involves translating spoken English into written Chinese. We utilize both cascaded models and end-to-end models for this task. To improve the performance of the cascaded models, we introduce Whisper to reduce errors in the intermediate source language text, achieving a significant improvement in ASR recognition performance. For end-to-end models, we propose Stacked Acoustic-and-Textual Encoding extension (SATE-ex), which feeds the output of the acoustic decoder into the textual decoder for information fusion and to prevent error propagation. Additionally, we improve the performance of the end-to-end system in translating speech by combining the SATE-ex model with the encoder-decoder model through ensembling.