Lexical syntax for Arabic SMT

Current approaches to Phrase-based Statistical Machine Translation (SMT) lack the ability to produce grammatical translations and to handle long-range reordering. In this chapter, we present our work on extending Phrase-based SMT with lexical syntactic descriptions that localize global syntactic information on the word without introducing redundant syntactic ambiguity. We present a novel Phrase-based SMT model that integrates linguistic lexical descriptions (supertags) into the target language model and the target side of the translation model. Moreover, we introduce a novel Incremental Dependency-based Syntactic Language Model (IDLM), based on wide-coverage incremental CCG parsing, which we integrate into a direct translation SMT system. Our proposed approach is the first to integrate full dependency parsing into SMT systems at a very attractive computational cost, since it uses the linear decoders widely used in Phrase-based SMT systems. The experimental results show a good improvement over top-ranked state-of-the-art systems.