Interactive-Predictive Neural Machine Translation through Reinforcement and Imitation

We propose an interactive-predictive neural machine translation framework that eases model personalization through reinforcement and imitation learning. During the interactive translation process, the system identifies locations where it is uncertain and asks the user for feedback on them. Responses take two forms: weak feedback given as "keep" and "delete" edits, and expert demonstrations given as "substitute" edits. Conditioning on the collected feedback, the system creates alternative translations via constrained beam search. In simulation experiments on two language pairs, our systems come close to the performance of supervised training with much less human effort.
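The feedback loop described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that uncertainty is measured as per-token entropy above a fixed threshold, and that user responses are simulated by a position-wise comparison with a reference translation. The function names, the threshold, and the comparison rule are all hypothetical, and the constrained beam search step is omitted.

```python
import math


def token_entropy(dist):
    """Shannon entropy (in nats) of one token's output distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def uncertain_positions(dists, threshold):
    """Indices whose entropy exceeds the threshold (assumed uncertainty criterion)."""
    return [i for i, d in enumerate(dists) if token_entropy(d) > threshold]


def simulate_feedback(hyp, ref, positions):
    """Simulated user edits at the queried positions (assumed position-wise rule).

    keep:       hypothesis token matches the reference token at that position
    substitute: reference has a different token there (acts as a demonstration)
    delete:     hypothesis extends past the end of the reference
    """
    feedback = {}
    for i in positions:
        if i < len(ref) and hyp[i] == ref[i]:
            feedback[i] = ("keep", hyp[i])
        elif i < len(ref):
            feedback[i] = ("substitute", ref[i])
        else:
            feedback[i] = ("delete", None)
    return feedback


# Toy example: four hypothesis tokens with made-up output distributions.
hyp = ["the", "cat", "sat", "down"]
ref = ["the", "dog", "sat"]
dists = [
    [0.97, 0.01, 0.01, 0.01],  # confident, not queried
    [0.40, 0.30, 0.20, 0.10],  # uncertain
    [0.25, 0.25, 0.25, 0.25],  # uncertain
    [0.50, 0.50, 0.00, 0.00],  # uncertain
]
positions = uncertain_positions(dists, threshold=0.5)
feedback = simulate_feedback(hyp, ref, positions)
```

In a full system, the "keep" and "substitute" tokens would become lexical constraints for the next decoding pass, while "delete" edits would serve as negative feedback. How these signals enter the training objective is specific to the paper and is not reproduced here.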