This article reports a multifaceted comparison between statistical
and neural machine translation (MT) systems that were developed for translation of data from Massive Open Online Courses (MOOCs). The study covers four
language pairs, from English into German, Greek, Portuguese, and Russian. Translation quality is evaluated using automatic metrics and human evaluation, the latter carried out by professional translators. Results show that neural MT is preferred
in side-by-side ranking, and is found to contain fewer overall errors. Results
are less clear-cut for some error categories, and for temporal and technical
post-editing effort. In addition, results are broken down by sentence length,
showing advantages and disadvantages that depend on the particular language
pair and MT paradigm.