Phrasal Cohesion and Statistical Machine Translation

There has been much interest in using phrasal movement to improve statistical machine translation. We explore how well phrases cohere across two languages, specifically English and French, and examine the particular conditions under which they do not. We demonstrate that while there are cases where coherence is poor, there are many regularities which can be exploited by a statistical machine translation system. We also compare three variant syntactic representations to determine which one has the best properties with respect to cohesion.