Abstract
This paper describes BIGAI’s submission to the IWSLT 2023 Offline Speech Translation task on three language tracks, from English to Chinese, German, and Japanese. The end-to-end systems are built upon a Wav2Vec2 model for speech recognition and mBART50 models for machine translation. An adapter module bridges the speech module and the translation module. A CTC loss between the speech features and the source token sequence is incorporated during training. Experiments show that the systems can generate reasonable translations in all three languages. The proposed models achieve BLEU scores of 22.3 for en→de, 10.7 for en→ja, and 33.0 for en→zh on the tst2023 TED datasets. However, performance decreases by a significant margin in complex scenarios such as presentations and interviews.
- Anthology ID:
- 2023.iwslt-1.7
- Volume:
- Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada (in-person and online)
- Editors:
- Elizabeth Salesky, Marcello Federico, Marine Carpuat
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 123–129
- URL:
- https://aclanthology.org/2023.iwslt-1.7
- DOI:
- 10.18653/v1/2023.iwslt-1.7
- Cite (ACL):
- Zhihang Xie. 2023. The BIGAI Offline Speech Translation Systems for IWSLT 2023 Evaluation. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 123–129, Toronto, Canada (in-person and online). Association for Computational Linguistics.
- Cite (Informal):
- The BIGAI Offline Speech Translation Systems for IWSLT 2023 Evaluation (Xie, IWSLT 2023)
- PDF:
- https://preview.aclanthology.org/corrections-2024-04/2023.iwslt-1.7.pdf