Mosleh Hmoud Al-Adhaileh
Also published as:
Mosleh H. Al-Adhaileh
In this paper, we present an approach to constructing a huge Bilingual Knowledge Bank (BKB) from an English-Malay bilingual dictionary, based on the idea of synchronous Structured String-Tree Correspondence (SSTC). The SSTC is a general structure that can associate a string in a language with an arbitrary tree structure, chosen by the annotator as the interpretation structure of that string. More importantly, it provides a facility for specifying the correspondence between the string and the associated tree, which can be non-projective. With this structure, we are able to match linguistic units at different levels of the structure (i.e. define the correspondence between substrings in the sentence, nodes in the tree, subtrees in the tree and sub-correspondences in the SSTC). This flexibility makes synchronous SSTC very well suited to the construction of the Bilingual Knowledge Bank we need for the English-Malay MT application.
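To make the string-tree correspondence concrete, here is a minimal Python sketch of one way such a structure could be represented. It is an illustration under stated assumptions, not the authors' implementation: the names `Node`, `snode`, `stree`, `merge` and `text` are hypothetical, spans are half-open word intervals, and the example sentence is chosen only because its verb ("picks ... up") corresponds to a discontiguous substring, which is the non-projective case mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[int, int]  # half-open word interval [start, end); an assumption of this sketch


def merge(spans: List[Span]) -> List[Span]:
    """Sort spans and fuse those that touch or overlap."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


@dataclass
class Node:
    label: str
    snode: List[Span]  # substring(s) corresponding to this node alone
    children: List["Node"] = field(default_factory=list)

    def stree(self) -> List[Span]:
        """Substring(s) corresponding to the whole subtree rooted at this node."""
        spans = list(self.snode)
        for child in self.children:
            spans.extend(child.stree())
        return merge(spans)


def text(words: List[str], spans: List[Span]) -> str:
    """Read the words covered by a list of spans back out of the sentence."""
    return " ".join(w for s, e in spans for w in words[s:e])


words = ["He", "picks", "the", "ball", "up"]
the = Node("the", [(2, 3)])
ball = Node("ball", [(3, 4)], [the])
he = Node("He", [(0, 1)])
# The node "pick up" corresponds to two separated words: a discontiguous substring.
root = Node("pick up", [(1, 2), (4, 5)], [he, ball])

print(text(words, root.snode))    # substring for the root node alone
print(text(words, ball.stree()))  # substring for the subtree rooted at "ball"
```

Keeping the node-level and subtree-level spans separate is what lets units be matched at different levels: a node alone, a whole subtree, or the substrings they cover.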
In this paper, we describe an Example-Based Machine Translation (EBMT) system for English-Malay translation. Our approach relies solely on example translations kept in a Bilingual Knowledge Bank (BKB). A flexible annotation schema called Structured String-Tree Correspondence (SSTC) is used to annotate both the source and target sentences of a translation pair. Each SSTC describes a sentence, a representation tree, and the correspondences between substrings in the sentence and subtrees in the representation tree. With both the source and target SSTCs established, a translation example in the BKB can then be represented effectively as a pair of synchronous SSTCs. In the process of translation, we first try to build the representation tree for the source sentence (English) using the example-based parsing algorithm presented in [1]. By referring to the resultant source parse tree, we then proceed to synthesize the target sentence (Malay) from the target SSTCs pointed to by the synchronous SSTCs, which encode the relationship between source and target SSTCs.
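The idea of a translation example stored as a pair of linked annotations can be sketched as follows. This is a heavily simplified illustration, not the system described in the abstract: the class `SyncExample`, its `links` table and the method `translate_unit` are hypothetical names, the trees are omitted so that only span-to-span links remain, and the English-Malay sentence pair is an invented example rather than an entry from the actual BKB.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Span = Tuple[int, int]    # half-open word interval [start, end)
Spans = Tuple[Span, ...]  # possibly discontiguous substring


@dataclass
class SyncExample:
    """One translation example: two sentences plus links between their units."""
    source: List[str]
    target: List[str]
    links: Dict[Spans, Spans]  # source unit -> corresponding target unit

    def translate_unit(self, src_spans: Sequence[Span]) -> str:
        """Return the target substring linked to the given source unit."""
        tgt_spans = self.links[tuple(src_spans)]
        return " ".join(w for s, e in tgt_spans for w in self.target[s:e])


# Invented illustrative pair; the links are assumptions made for this sketch.
example = SyncExample(
    source=["He", "picks", "the", "ball", "up"],
    target=["Dia", "mengutip", "bola", "itu"],
    links={
        ((0, 1),): ((0, 1),),         # He            -> Dia
        ((1, 2), (4, 5)): ((1, 2),),  # picks ... up  -> mengutip
        ((2, 4),): ((2, 4),),         # the ball      -> bola itu
    },
)

print(example.translate_unit([(1, 2), (4, 5)]))
print(example.translate_unit([(2, 4)]))
```

In a fuller version, the keys and values of `links` would refer to nodes and subtrees of the two representation trees rather than bare spans, so that a matched source subtree points directly at the target structure used for synthesis.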