Abstract
We present a syntax-based language model for use in noisy-channel machine translation. In particular, a language model based upon that described in (Cha01) is combined with the syntax-based translation model described in (YK01). The resulting system was used to translate 347 sentences from Chinese to English and compared with the results of an IBM-model-4-based system, as well as that of (YK02), all trained on the same data. The translations were sorted into four groups: good/bad syntax crossed with good/bad meaning. While the total number of translations that preserved meaning was the same for (YK02) and the syntax-based system (and both were higher than for the IBM-model-4-based system), the syntax-based system had 45% more translations that also had good syntax than did (YK02) (and approximately 70% more than IBM Model 4). The number of translations that did not preserve meaning, but at least had good grammar, also increased, though to less avail.
- Anthology ID:
- 2003.mtsummit-papers.6
- Volume:
- Proceedings of Machine Translation Summit IX: Papers
- Month:
- September 23-27
- Year:
- 2003
- Address:
- New Orleans, USA
- Venue:
- MTSummit
- URL:
- https://aclanthology.org/2003.mtsummit-papers.6
- Cite (ACL):
- Eugene Charniak, Kevin Knight, and Kenji Yamada. 2003. Syntax-based language models for statistical machine translation. In Proceedings of Machine Translation Summit IX: Papers, New Orleans, USA.
- Cite (Informal):
- Syntax-based language models for statistical machine translation (Charniak et al., MTSummit 2003)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2003.mtsummit-papers.6.pdf