Example-based Machine Translation Based on TSC and Statistical Generation

Zhanyi Liu, Haifeng Wang, Hua Wu


Abstract
This paper proposes a novel Example-Based Machine Translation (EBMT) method based on Tree String Correspondence (TSC) and statistical generation. In this method, each translation example is represented as a TSC, which consists of three parts: a parse tree in the source language, a string in the target language, and the correspondences between the leaf nodes of the source-language tree and the substrings of the target-language string. During translation, the input sentence is first parsed into a tree. The TSC forest that best matches the parse tree is then retrieved. The translation is generated by using a statistical generation model to combine the target-language strings in the TSCs. The generation model consists of three parts: the semantic similarity between words, the word translation probability, and the target language model. Based on this method, we build an English-to-Chinese Machine Translation (ECMT) system. Experimental results indicate that the performance of our system is comparable to that of state-of-the-art commercial ECMT systems.
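The three-part TSC structure described in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the class names `TreeNode` and `TSC`, the choice of token-index spans for the correspondences, and the toy English-Chinese example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TreeNode:
    """A node of the source-language parse tree (hypothetical representation)."""
    label: str  # syntactic category, or the word itself for a leaf
    children: List["TreeNode"] = field(default_factory=list)

    def leaves(self) -> List["TreeNode"]:
        """Return the leaf nodes in left-to-right order."""
        if not self.children:
            return [self]
        result: List["TreeNode"] = []
        for child in self.children:
            result.extend(child.leaves())
        return result


@dataclass
class TSC:
    """Tree String Correspondence: source tree, target string, leaf alignment."""
    source_tree: TreeNode
    target_tokens: List[str]
    # Assumed encoding: leaf index -> half-open (start, end) span of target tokens.
    alignment: Dict[int, Tuple[int, int]]

    def target_substring(self, leaf_index: int) -> List[str]:
        """Return the target tokens aligned to the given source leaf."""
        start, end = self.alignment[leaf_index]
        return self.target_tokens[start:end]


# Toy example (invented for illustration): "I like apples" -> "我 喜欢 苹果"
example = TSC(
    source_tree=TreeNode("S", [
        TreeNode("NP", [TreeNode("I")]),
        TreeNode("VP", [
            TreeNode("V", [TreeNode("like")]),
            TreeNode("NP", [TreeNode("apples")]),
        ]),
    ]),
    target_tokens=["我", "喜欢", "苹果"],
    alignment={0: (0, 1), 1: (1, 2), 2: (2, 3)},
)

for i, leaf in enumerate(example.source_tree.leaves()):
    print(leaf.label, "->", " ".join(example.target_substring(i)))
```

Storing the correspondences as spans over target tokens lets one source leaf align to a multi-word substring, which matches the abstract's wording of "substrings" rather than single words.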
Anthology ID:
2005.mtsummit-papers.4
Volume:
Proceedings of Machine Translation Summit X: Papers
Month:
September 13-15
Year:
2005
Address:
Phuket, Thailand
Venue:
MTSummit
Pages:
25–32
URL:
https://aclanthology.org/2005.mtsummit-papers.4
Cite (ACL):
Zhanyi Liu, Haifeng Wang, and Hua Wu. 2005. Example-based Machine Translation Based on TSC and Statistical Generation. In Proceedings of Machine Translation Summit X: Papers, pages 25–32, Phuket, Thailand.
Cite (Informal):
Example-based Machine Translation Based on TSC and Statistical Generation (Liu et al., MTSummit 2005)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2005.mtsummit-papers.4.pdf