Inverted Projection for Robust Speech Translation

Dirk Padfield, Colin Cherry


Abstract
Traditional translation systems trained on written documents perform well for text-based translation but not as well for speech-based applications. We aim to adapt translation models to speech by introducing actual lexical errors from ASR and segmentation errors from automatic punctuation into our translation training data. We introduce an inverted projection approach that projects automatically detected system segments onto human transcripts and then re-segments the gold translations to align with the projected human transcripts. We demonstrate that this overcomes the train-test mismatch present in other training approaches. The new projection approach achieves gains of over 1 BLEU point over a baseline that is exposed to the human transcripts and segmentations, and these gains hold for both IWSLT data and YouTube data.
Anthology ID:
2021.iwslt-1.28
Volume:
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)
Month:
August
Year:
2021
Address:
Bangkok, Thailand (online)
Venue:
IWSLT
SIG:
SIGSLT
Publisher:
Association for Computational Linguistics
Pages:
236–244
URL:
https://aclanthology.org/2021.iwslt-1.28
DOI:
10.18653/v1/2021.iwslt-1.28
Cite (ACL):
Dirk Padfield and Colin Cherry. 2021. Inverted Projection for Robust Speech Translation. In Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021), pages 236–244, Bangkok, Thailand (online). Association for Computational Linguistics.
Cite (Informal):
Inverted Projection for Robust Speech Translation (Padfield & Cherry, IWSLT 2021)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2021.iwslt-1.28.pdf