Speech Translation with Style: AppTek’s Submissions to the IWSLT Subtitling and Formality Tracks in 2023
Parnia Bahar, Patrick Wilken, Javier Iranzo-Sánchez, Mattia Di Gangi, Evgeny Matusov, Zoltán Tüske
Abstract
AppTek participated in the subtitling and formality tracks of the IWSLT 2023 evaluation. This paper describes the details of our subtitling pipeline (speech segmentation, speech recognition, punctuation prediction and inverse text normalization, text machine translation and direct speech-to-text translation, and intelligent line segmentation) and how we make use of the provided subtitling-specific data in training and fine-tuning. The evaluation results show that our final submissions are competitive, in particular outperforming the submissions by other participants by 5% absolute as measured by the SubER subtitle quality metric. For the formality track, we participate with our En-Ru and En-Pt production models, which support formality control via prefix tokens. Except for informal Portuguese, we achieve near-perfect formality level accuracy while at the same time offering high general translation quality.
- Anthology ID:
- 2023.iwslt-1.22
- Volume:
- Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada (in-person and online)
- Editors:
- Elizabeth Salesky, Marcello Federico, Marine Carpuat
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 251–260
- URL:
- https://aclanthology.org/2023.iwslt-1.22
- DOI:
- 10.18653/v1/2023.iwslt-1.22
- Cite (ACL):
- Parnia Bahar, Patrick Wilken, Javier Iranzo-Sánchez, Mattia Di Gangi, Evgeny Matusov, and Zoltán Tüske. 2023. Speech Translation with Style: AppTek’s Submissions to the IWSLT Subtitling and Formality Tracks in 2023. In Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023), pages 251–260, Toronto, Canada (in-person and online). Association for Computational Linguistics.
- Cite (Informal):
- Speech Translation with Style: AppTek’s Submissions to the IWSLT Subtitling and Formality Tracks in 2023 (Bahar et al., IWSLT 2023)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2023.iwslt-1.22.pdf