On the Use of Grammar Based Language Models for Statistical Machine Translation

Hassan Sawaf, Kai Schütz, Hermann Ney


Abstract
In this paper, we describe language models that go beyond the commonly used standard trigram and apply them to statistical machine translation. In statistical machine translation, the language model is the system's a-priori knowledge source about the target language. One important requirement for the language model is that it capture the correct word order, given a certain choice of words, and that it score the translations generated by the translation model $Pr(f_1^J | e_1^I)$ in view of the syntactic context. In addition to standard m-grams with long histories, we examine the use of Part-of-Speech based models as well as linguistically motivated grammars with stochastic parsing as a special type of language model. Translation results are given on the VERBMOBIL task, where translation is performed from German to English, with vocabulary sizes of 6500 and 4000 words, respectively.
Anthology ID:
2000.iwpt-1.23
Volume:
Proceedings of the Sixth International Workshop on Parsing Technologies
Month:
February 23-25
Year:
2000
Address:
Trento, Italy
Venue:
IWPT
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
231–241
URL:
https://aclanthology.org/2000.iwpt-1.23
Cite (ACL):
Hassan Sawaf, Kai Schütz, and Hermann Ney. 2000. On the Use of Grammar Based Language Models for Statistical Machine Translation. In Proceedings of the Sixth International Workshop on Parsing Technologies, pages 231–241, Trento, Italy. Association for Computational Linguistics.
Cite (Informal):
On the Use of Grammar Based Language Models for Statistical Machine Translation (Sawaf et al., IWPT 2000)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2000.iwpt-1.23.pdf