Example-Based Machine Translation Based on the Synchronous SSTC Annotation Schema

In this paper, we describe an Example-Based Machine Translation (EBMT) system for EnglishMalay translation. Our approach is an examplebased approach which relies sorely on example translations kept in a Bilingual Knowledge Bank (BKB). In our approach, a flexible annotation schema called Structured String-Tree Correspondence (SSTC) is used to annotate both the source and target sentences of a translation pair. Each SSTC describes a sentence, a representation tree as well as the correspondences between substrings in the sentence and subtrees in the representation tree. With both the source and target SSTCs established, a translation example in the BKB can then be represented effectively in terms of a pair of synchronous SSTCs. In the process of translation, we first try to build the representation tree for the source sentence (English) based on the example-based parsing algorithm as presented in [1]. By referring to the resultant source parse tree, we then proceed to synthesis the target sentence (Malay) based on the target SSTCs as pointed to by the synchronous SSTCs which encode the relationship between source and target SSTCs.