Automatic Translation for Software with Safe Velocity

We report on a model for machine translation (MT) of software, without review, for the Microsoft Office product range. We have deployed an automated localisation workflow, known as Automated Translation (AT) for software, which identifies resource strings that are suitable and safe for MT without post-editing. The model combines string profiling, user impact assessment, MT quality estimation, and customer feedback mechanisms. This allows us to introduce automatic translation at a safe velocity, with minimal risk to customer satisfaction. Quality constraints limit the volume of MT relative to human translation: published low-quality MT may not exceed 10% of the total word count. The AT for software model has been deployed into production for most of the Office product range, in 37 languages. For some languages and products, it allows us to machine-translate and publish without review more than 20% of the word count. To date, we have processed more than 1 million words with this model and have not seen any measurable negative impact on customer satisfaction.
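As a rough illustration of how such gating could work, the sketch below filters resource strings by a user impact score and an MT quality estimate, then admits strings only while the expected volume of low-quality MT stays within the 10% word-count cap. The abstract does not specify the actual scoring or selection logic, so every name and threshold here (`ResourceString`, `user_impact`, `qe_score`, `max_impact`, `min_qe`) is a hypothetical assumption, and treating the quality estimate as a probability of acceptable output is a simplification made for this example only.

```python
from dataclasses import dataclass


@dataclass
class ResourceString:
    text: str
    user_impact: float  # hypothetical score in [0, 1]; higher = more visible to users
    qe_score: float     # hypothetical estimate in [0, 1] that the MT output is acceptable


def word_count(s: ResourceString) -> int:
    # Naive whitespace tokenisation; a real workflow would use a proper word counter.
    return len(s.text.split())


def select_for_mt(strings, max_impact=0.5, min_qe=0.7, low_quality_cap=0.10):
    """Pick strings to publish as unreviewed MT (illustrative thresholds only)."""
    total_words = sum(word_count(s) for s in strings)
    # Budget of expected low-quality words, as a share of the total word count.
    budget = low_quality_cap * total_words

    # Keep only low-impact strings whose estimated quality is high enough.
    candidates = [
        s for s in strings
        if s.user_impact <= max_impact and s.qe_score >= min_qe
    ]
    # Consider the most confident translations first.
    candidates.sort(key=lambda s: s.qe_score, reverse=True)

    selected = []
    expected_low_quality_words = 0.0
    for s in candidates:
        risk = word_count(s) * (1.0 - s.qe_score)
        if expected_low_quality_words + risk > budget:
            continue  # this string would push expected low-quality MT over the cap
        selected.append(s)
        expected_low_quality_words += risk
    return selected
```

Strings that are not selected would fall back to human translation. In a setup like this, raising `low_quality_cap` or relaxing the thresholds increases MT volume at the cost of higher expected risk, which is one way to read the "safe velocity" trade-off described above.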
