PaCo-MT

The PaCo-MT project is building a stochastic, example-based transfer system that translates from Dutch into English and French, and vice versa. It takes a data-driven, tree-to-tree approach to MT, transducing the input parse tree into a set of target-language parse trees whose nodes are left unordered. The underlying Synchronous Tree Substitution Grammar (limited to regular subtrees) is induced from a subtree-aligned parallel treebank, using a discriminative model for tree alignment. Monolingual parses are produced by existing parsers, such as the Alpino parser for Dutch, the Stanford parser for English, and the Berkeley parser for French. A tree-based target language modeler, which uses a probabilistic context-free grammar derived from large monolingual treebanks, decodes the output forest and determines node ordering.
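
The node-ordering step described above can be illustrated with a minimal Python sketch. It is not the PaCo-MT decoder: it assumes a single unordered tree rather than a forest, scores each candidate child order with one PCFG rule probability, and searches by exhaustive permutation. The names TOY_PCFG, order_tree and words, the tree encoding, and all probabilities are hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Hypothetical toy PCFG: (parent label, ordered child labels) -> probability.
# The numbers are invented for illustration, not taken from any treebank.
TOY_PCFG = {
    ("S", ("NP", "VP")): 0.9,
    ("S", ("VP", "NP")): 0.1,
    ("VP", ("V", "NP")): 0.8,
    ("VP", ("NP", "V")): 0.2,
}


def order_tree(node, pcfg):
    """Return (ordered_tree, score) for a tree whose children are unordered.

    A node is a pair (label, payload). The payload is a string for a leaf,
    or a list of child nodes in arbitrary order for an internal node.
    """
    label, payload = node
    if isinstance(payload, str):
        return node, 1.0  # leaves need no ordering

    # Order each subtree first; subtree scores do not depend on sibling order.
    scored_children = [order_tree(child, pcfg) for child in payload]

    # Try every child order and keep the one with the most probable rule.
    best_perm, best_p = None, -1.0
    for perm in permutations(scored_children):
        rhs = tuple(tree[0] for tree, _ in perm)
        p = pcfg.get((label, rhs), 0.0)  # unseen rules get probability 0
        if p > best_p:
            best_perm, best_p = perm, p

    score = best_p
    for _, child_score in best_perm:
        score *= child_score
    return (label, [tree for tree, _ in best_perm]), score


def words(node):
    """Read off the leaf strings of an ordered tree from left to right."""
    _, payload = node
    if isinstance(payload, str):
        return [payload]
    return [w for child in payload for w in words(child)]


# An unordered target-side tree, as a transfer step might produce it.
unordered = ("S", [("VP", [("NP", "the ball"), ("V", "kicks")]), ("NP", "she")])
ordered, score = order_tree(unordered, TOY_PCFG)
print(" ".join(words(ordered)))
```

Exhaustive permutation is factorial in the number of children, so this only works for small nodes; a realistic decoder would need pruning or a packed representation of the alternatives.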