Monotone statistical translation using word groups

Jesús Tomás, Francisco Casacuberta


Abstract
We introduce a new system for statistical natural language translation between languages with similar grammar. Specifically, it can be used with Romance languages such as French, Spanish or Catalan. Statistical translation uses two sources of information: a language model and a translation model. The language model is a standard trigram model, while the translation model follows a new approach. Its two main properties are that translation probabilities are computed between groups of words and that the alignment between those groups is monotone; that is, the order of the word groups in the source sentence is preserved in the target sentence. Once the translation model has been defined, we present an algorithm to infer its parameters from training samples. The translation process is carried out with an efficient algorithm based on stack decoding. Finally, we present some translation results from Catalan to Spanish and compare our model with other conventional models.
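To make the idea of monotone translation over word groups concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical toy phrase table (`PHRASE_TABLE`), a stand-in trigram scorer (`lm_logprob`) with invented probabilities, and a plain best-first search with a priority queue, in the spirit of stack decoding. Hypotheses consume the source strictly left to right, so the order of the word groups is preserved in the output.

```python
import heapq
import math

# Hypothetical toy table: source word group -> [(target word group, probability)].
PHRASE_TABLE = {
    ("bon",): [(("buen",), 0.6), (("bueno",), 0.4)],
    ("dia",): [(("día",), 1.0)],
    ("bon", "dia"): [(("buenos", "días"), 0.9)],
    ("a",): [(("a",), 1.0)],
    ("tothom",): [(("todos",), 0.8), (("todo", "el", "mundo"), 0.2)],
}

# Hypothetical trigram probabilities; unseen trigrams fall back to DEFAULT_P.
TRIGRAMS = {("buenos", "días", "a"): 0.5}
DEFAULT_P = 0.1


def lm_logprob(history, word):
    """Stand-in for log P(word | two previous target words)."""
    return math.log(TRIGRAMS.get(tuple(history) + (word,), DEFAULT_P))


def translate(source, max_group_len=3):
    """Best-first monotone search; returns (target words, log score)."""
    # Each hypothesis: (cost so far, source words covered, target so far).
    # Cost is a negative log probability, so it never decreases and the
    # first complete hypothesis popped from the heap is the best one.
    heap = [(0.0, 0, ())]
    while heap:
        cost, pos, target = heapq.heappop(heap)
        if pos == len(source):
            return list(target), -cost
        # Extend with the next source word group only (monotone alignment).
        last = min(len(source), pos + max_group_len)
        for end in range(pos + 1, last + 1):
            group = tuple(source[pos:end])
            for tgt, prob in PHRASE_TABLE.get(group, []):
                new_cost = cost - math.log(prob)
                new_target = target
                for word in tgt:
                    new_cost -= lm_logprob(new_target[-2:], word)
                    new_target += (word,)
                heapq.heappush(heap, (new_cost, end, new_target))
    return None, float("-inf")  # no segmentation is covered by the table


words, score = translate(["bon", "dia", "a", "tothom"])
print(" ".join(words), score)
```

Because no reordering is allowed, the search space is just the set of segmentations of the source sentence combined with the translation choices for each group, which keeps the search small for closely related language pairs.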
Anthology ID:
2001.mtsummit-papers.64
Volume:
Proceedings of Machine Translation Summit VIII
Month:
September 18-22
Year:
2001
Address:
Santiago de Compostela, Spain
Venue:
MTSummit
URL:
https://aclanthology.org/2001.mtsummit-papers.64
Cite (ACL):
Jesús Tomás and Francisco Casacuberta. 2001. Monotone statistical translation using word groups. In Proceedings of Machine Translation Summit VIII, Santiago de Compostela, Spain.
Cite (Informal):
Monotone statistical translation using word groups (Tomás & Casacuberta, MTSummit 2001)
PDF:
https://preview.aclanthology.org/update-css-js/2001.mtsummit-papers.64.pdf