Linguistically-motivated Tree-based Probabilistic Phrase Alignment

Toshiaki Nakazawa, Sadao Kurohashi


Abstract
In this paper, we propose a probabilistic phrase alignment model based on dependency trees. The model is linguistically motivated: it uses syntactic information during the alignment process. Its main advantage is that linguistic differences between the source and target languages are successfully absorbed. It is composed of two models: Model1 uses content word translation probability and function word translation probability; Model2 uses dependency relation probability, which is defined for a pair of positional relations on dependency trees. The relation probability acts as a tree-based phrase reordering model. Since the model is directed, we combine the two alignment results from bi-directional training using symmetrization heuristics to obtain the final alignment. We conduct experiments on a Japanese-English corpus and achieve reasonably high alignment quality compared with a word-based alignment model.
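The abstract does not say which symmetrization heuristics are used. As a minimal sketch only, the snippet below shows the two simplest ways to combine directional alignments, intersection and union. It assumes each alignment is a set of index pairs; the function name `symmetrize` and the toy data are hypothetical and do not come from the paper.

```python
from typing import Set, Tuple

Link = Tuple[int, int]  # (source index, target index)


def symmetrize(src_to_tgt: Set[Link], tgt_to_src: Set[Link]) -> Tuple[Set[Link], Set[Link]]:
    """Return (intersection, union) of two directional alignments.

    Assumption: tgt_to_src stores links as (target index, source index),
    so they are flipped before the two sets are compared.
    """
    flipped = {(s, t) for (t, s) in tgt_to_src}
    return src_to_tgt & flipped, src_to_tgt | flipped


# Toy alignments (hypothetical data, not taken from the paper).
s2t = {(0, 0), (1, 2), (2, 1)}
t2s = {(0, 0), (2, 1), (3, 3)}

high_precision, high_recall = symmetrize(s2t, t2s)
```

The intersection keeps only links that both directions agree on, while the union keeps every link proposed by either direction. Practical heuristics usually start from the intersection and selectively add links from the union.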
Anthology ID:
2008.amta-papers.15
Volume:
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers
Month:
October 21-25
Year:
2008
Address:
Waikiki, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
163–171
URL:
https://aclanthology.org/2008.amta-papers.15
Cite (ACL):
Toshiaki Nakazawa and Sadao Kurohashi. 2008. Linguistically-motivated Tree-based Probabilistic Phrase Alignment. In Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers, pages 163–171, Waikiki, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Linguistically-motivated Tree-based Probabilistic Phrase Alignment (Nakazawa & Kurohashi, AMTA 2008)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2008.amta-papers.15.pdf