Improve Speech Translation Through Text Rewrite
Jing Wu, Shushu Wang, Kai Fan, Wei Luo, Minpeng Liao, Zhongqiang Huang
Abstract
Despite recent progress in Speech Translation (ST) research, the challenges posed by inherent speech phenomena that distinguish transcribed speech from written text are not well addressed. The informal and erroneous nature of spontaneous speech is inadequately represented in the typical parallel text available for building translation models. We propose to address these issues through a text rewrite approach that aims to transform transcribed speech into a cleaner style more in line with the expectations of translation models built from written text. Moreover, the advantages of the rewrite model can be effectively distilled into a standalone translation model. Experiments on several benchmarks, using both publicly available and in-house translation models, demonstrate that adding a rewrite model to a traditional ST pipeline is a cost-effective way to address a variety of speech irregularities and improve speech translation quality for multiple language directions and domains.
- Anthology ID:
- 2025.coling-industry.28
- Volume:
- Proceedings of the 31st International Conference on Computational Linguistics: Industry Track
- Month:
- January
- Year:
- 2025
- Address:
- Abu Dhabi, UAE
- Editors:
- Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert, Kareem Darwish, Apoorv Agarwal
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 331–342
- URL:
- https://preview.aclanthology.org/add-emnlp-2024-awards/2025.coling-industry.28/
- Cite (ACL):
- Jing Wu, Shushu Wang, Kai Fan, Wei Luo, Minpeng Liao, and Zhongqiang Huang. 2025. Improve Speech Translation Through Text Rewrite. In Proceedings of the 31st International Conference on Computational Linguistics: Industry Track, pages 331–342, Abu Dhabi, UAE. Association for Computational Linguistics.
- Cite (Informal):
- Improve Speech Translation Through Text Rewrite (Wu et al., COLING 2025)
- PDF:
- https://preview.aclanthology.org/add-emnlp-2024-awards/2025.coling-industry.28.pdf