A combination of hierarchical systems with forced alignments from phrase-based systems

Carmen Heger, Joern Wuebker, David Vilar, Hermann Ney


Abstract
Currently, most state-of-the-art statistical machine translation systems exhibit a mismatch between training and generation conditions. Word alignments are computed using the well-known IBM models for single-word-based translation. Phrases are then extracted using heuristics that are unrelated to the stochastic models used to find the word alignment. In recent years, several research groups have tried to overcome this mismatch, but with only limited success. Recently, the technique of forced alignment has been shown to improve translation quality for a phrase-based system by applying a more statistically sound approach to phrase extraction. In this work we investigate the first steps toward combining forced alignment with a hierarchical model. Experimental results on IWSLT and WMT data show improvements in translation quality of up to 0.7% BLEU and 1.0% TER.
Anthology ID:
2010.iwslt-papers.11
Volume:
Proceedings of the 7th International Workshop on Spoken Language Translation: Papers
Month:
December 2-3
Year:
2010
Address:
Paris, France
Venue:
IWSLT
SIG:
SIGSLT
Pages:
291–297
URL:
https://aclanthology.org/2010.iwslt-papers.11
Cite (ACL):
Carmen Heger, Joern Wuebker, David Vilar, and Hermann Ney. 2010. A combination of hierarchical systems with forced alignments from phrase-based systems. In Proceedings of the 7th International Workshop on Spoken Language Translation: Papers, pages 291–297, Paris, France.
Cite (Informal):
A combination of hierarchical systems with forced alignments from phrase-based systems (Heger et al., IWSLT 2010)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2010.iwslt-papers.11.pdf