End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021
Gerard I. Gállego, Ioannis Tsiamas, Carlos Escolano, José A. R. Fonollosa, Marta R. Costa-jussà
Abstract
This paper describes the submission to the IWSLT 2021 offline speech translation task by the UPC Machine Translation group. The task consists of building a system capable of translating English audio recordings extracted from TED talks into German text. Submitted systems can be either cascade or end-to-end and use a custom or given segmentation. Our submission is an end-to-end speech translation system that combines pre-trained models (Wav2Vec 2.0 and mBART) with coupling modules between the encoder and decoder, and uses an efficient fine-tuning technique that trains only 20% of its total parameters. We show that adding an Adapter to the system and pre-training it can improve convergence speed and the final result, with which we achieve a BLEU score of 27.3 on the MuST-C test set. Our final model is an ensemble that obtains a BLEU score of 28.22 on the same set. Our submission also uses a custom segmentation algorithm that employs pre-trained Wav2Vec 2.0 for identifying periods of untranscribable text, and it can bring improvements of 2.5 to 3 BLEU points on the IWSLT 2019 test set compared to the result with the given segmentation.
- Anthology ID:
- 2021.iwslt-1.11
- Volume:
- Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
- Month:
- August
- Year:
- 2021
- Address:
- Bangkok, Thailand (online)
- Editors:
- Marcello Federico, Alex Waibel, Marta R. Costa-jussà, Jan Niehues, Sebastian Stüker, Elizabeth Salesky
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 110–119
- Language:
- URL:
- https://aclanthology.org/2021.iwslt-1.11
- DOI:
- 10.18653/v1/2021.iwslt-1.11
- Cite (ACL):
- Gerard I. Gállego, Ioannis Tsiamas, Carlos Escolano, José A. R. Fonollosa, and Marta R. Costa-jussà. 2021. End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 110–119, Bangkok, Thailand (online). Association for Computational Linguistics.
- Cite (Informal):
- End-to-End Speech Translation with Pre-trained Models and Adapters: UPC at IWSLT 2021 (Gállego et al., IWSLT 2021)
- PDF:
- https://aclanthology.org/2021.iwslt-1.11.pdf
- Code
- mt-upc/iwslt-2021
- Data
- CoVoST, CoVoST2, Europarl-ST, IWSLT 2019, MuST-C
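
The abstract describes coupling a pre-trained Wav2Vec 2.0 encoder with an mBART decoder through an Adapter and training only about 20% of the total parameters. Below is a minimal, hypothetical sketch of that idea using HuggingFace Transformers; the checkpoint names, the bottleneck Adapter design, and the choice of which parameters stay trainable are illustrative assumptions, not the authors' exact coupling modules or fine-tuning recipe (see the mt-upc/iwslt-2021 repository for the actual implementation).

```python
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model, MBartForConditionalGeneration


class Adapter(nn.Module):
    """Small bottleneck module inserted between the speech encoder and the
    text decoder (a common adapter design; details here are assumed)."""

    def __init__(self, dim: int, bottleneck: int = 256):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual bottleneck transformation of the encoder states.
        return self.norm(x + self.up(torch.relu(self.down(x))))


class SpeechTranslationModel(nn.Module):
    """End-to-end ST: Wav2Vec 2.0 encoder -> Adapter -> mBART decoder."""

    def __init__(self):
        super().__init__()
        # Checkpoint names are assumptions for illustration; both have
        # hidden size 1024, so the coupling needs no extra projection.
        self.encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-960h")
        self.mbart = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
        self.adapter = Adapter(dim=1024)

        # Parameter-efficient fine-tuning: freeze the large pre-trained
        # models so that only a small fraction of parameters (here just
        # the Adapter) receives gradients.
        for p in self.encoder.parameters():
            p.requires_grad = False
        for p in self.mbart.parameters():
            p.requires_grad = False

    def forward(self, input_values: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Encode raw waveform into speech representations.
        speech_states = self.encoder(input_values).last_hidden_state
        # Adapt the speech representations before feeding them to mBART.
        coupled = self.adapter(speech_states)
        # Use the adapted states as the encoder outputs (cross-attention
        # memory) of mBART and compute the translation loss.
        out = self.mbart(encoder_outputs=(coupled,), labels=labels)
        return out.loss
```

The sketch only shows the structural coupling; the paper additionally pre-trains the Adapter and fine-tunes selected parts of the pre-trained models (not just the Adapter), which is how the trainable share reaches roughly 20% of all parameters.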