Abstract
Currently, most state-of-the-art statistical machine translation systems exhibit a mismatch between training and generation conditions. Word alignments are computed using the well-known IBM models for single-word-based translation. Phrases are then extracted using heuristics that are unrelated to the stochastic models used to find the word alignment. In recent years, several research groups have tried to overcome this mismatch, but with only limited success. Recently, the technique of forced alignment has been shown to improve translation quality for a phrase-based system by applying a more statistically sound approach to phrase extraction. In this work we investigate the first steps toward combining forced alignment with a hierarchical model. Experimental results on IWSLT and WMT data show improvements in translation quality of up to 0.7% BLEU and 1.0% TER.
- Anthology ID:
- 2010.iwslt-papers.11
- Volume:
- Proceedings of the 7th International Workshop on Spoken Language Translation: Papers
- Month:
- December 2-3
- Year:
- 2010
- Address:
- Paris, France
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Pages:
- 291–297
- URL:
- https://aclanthology.org/2010.iwslt-papers.11
- Cite (ACL):
- Carmen Heger, Joern Wuebker, David Vilar, and Hermann Ney. 2010. A combination of hierarchical systems with forced alignments from phrase-based systems. In Proceedings of the 7th International Workshop on Spoken Language Translation: Papers, pages 291–297, Paris, France.
- Cite (Informal):
- A combination of hierarchical systems with forced alignments from phrase-based systems (Heger et al., IWSLT 2010)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2010.iwslt-papers.11.pdf