Pre-ordering of phrase-based machine translation input in translation workflow

Alexandru Ceausu, Sabine Hunsicker


Abstract
Word reordering is a difficult task for decoders when the languages involved differ significantly in syntax. Phrase-based statistical machine translation (PBSMT), preferred in commercial settings due to its maturity, is particularly prone to errors in long-range reordering. Source sentence pre-ordering, as a pre-processing step before PBSMT, has proved to be an efficient solution that can be achieved with limited resources. We propose a dependency-based pre-ordering model whose parameters are optimized using a reordering score. The pre-ordered source sentence is then translated using an existing phrase-based system. The proposed solution is very simple to implement: it uses a hierarchical phrase-based statistical machine translation system (HPBSMT) for pre-ordering, combined with a PBSMT system for the actual translation. We show that the system can provide alternative translations that require less post-editing effort in a translation workflow with German as the source language.
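The idea of scoring a pre-ordered source sentence can be illustrated with a minimal sketch. The abstract does not say which reordering score is used; a normalized Kendall's tau is assumed here purely for illustration, and the function names, the example sentence, and its word alignment are all hypothetical rather than taken from the paper.

```python
from itertools import combinations


def kendall_tau_score(target_positions):
    """Share of word pairs already in target order (assumed score).

    target_positions[i] is the target-side position aligned to source word i.
    Returns 1.0 for a monotone order and 0.0 for a fully reversed one.
    """
    n = len(target_positions)
    if n < 2:
        return 1.0
    pairs = list(combinations(range(n), 2))
    concordant = sum(
        1 for i, j in pairs if target_positions[i] < target_positions[j]
    )
    return concordant / len(pairs)


def apply_preorder(items, order):
    """Rearrange items so that output slot k holds items[order[k]]."""
    return [items[i] for i in order]


# Hypothetical German source and the English positions of its words
# ("I have read the book"); the alignment is assumed, not learned.
source = ["ich", "habe", "das", "Buch", "gelesen"]
alignment = [0, 1, 3, 4, 2]

# Hypothetical pre-ordering that moves the clause-final participle forward.
order = [0, 1, 4, 2, 3]

preordered = apply_preorder(source, order)
preordered_alignment = apply_preorder(alignment, order)

score_before = kendall_tau_score(alignment)
score_after = kendall_tau_score(preordered_alignment)
```

In this toy case the score rises from 0.8 to 1.0 once the participle is moved, which is the kind of signal a pre-ordering model could be tuned against before the sentence is passed to the phrase-based system.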
Anthology ID:
L14-1147
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3589–3592
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/1213_Paper.pdf
Cite (ACL):
Alexandru Ceausu and Sabine Hunsicker. 2014. Pre-ordering of phrase-based machine translation input in translation workflow. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3589–3592, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Pre-ordering of phrase-based machine translation input in translation workflow (Ceausu & Hunsicker, LREC 2014)