Source aware phrase-based decoding for robust conversational spoken language translation
Sankaranarayanan Ananthakrishnan, Wei Chen, Rohit Kumar, Dennis Mehay
Abstract
Spoken language translation (SLT) systems typically follow a pipeline architecture, in which the best automatic speech recognition (ASR) hypothesis of an input utterance is fed into a statistical machine translation (SMT) system. Conversational speech often generates unrecoverable ASR errors owing to its rich vocabulary (e.g. out-of-vocabulary (OOV) named entities). In this paper, we study the possibility of alleviating the impact of unrecoverable ASR errors on translation performance by minimizing the contextual effects of incorrect source words in target hypotheses. Our approach is driven by locally-derived penalties applied to bilingual phrase pairs as well as target language model (LM) likelihoods in the vicinity of source errors. With oracle word error labels on an OOV word-rich English-to-Iraqi Arabic translation task, we show statistically significant relative improvements of 3.2% BLEU and 2.0% METEOR over an error-agnostic baseline SMT system. We then investigate the impact of imperfect source error labels on error-aware translation performance. Simulation experiments reveal that modest translation improvements are to be gained with this approach even when the source error labels are noisy.
- Anthology ID:
- 2013.iwslt-papers.17
- Volume:
- Proceedings of the 10th International Workshop on Spoken Language Translation: Papers
- Month:
- December 5-6
- Year:
- 2013
- Address:
- Heidelberg, Germany
- Editor:
- Joy Ying Zhang
- Venue:
- IWSLT
- SIG:
- SIGSLT
- URL:
- https://aclanthology.org/2013.iwslt-papers.17
- Cite (ACL):
- Sankaranarayanan Ananthakrishnan, Wei Chen, Rohit Kumar, and Dennis Mehay. 2013. Source aware phrase-based decoding for robust conversational spoken language translation. In Proceedings of the 10th International Workshop on Spoken Language Translation: Papers, Heidelberg, Germany.
- Cite (Informal):
- Source aware phrase-based decoding for robust conversational spoken language translation (Ananthakrishnan et al., IWSLT 2013)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2013.iwslt-papers.17.pdf