Exploring Supervised and Unsupervised Rewards in Machine Translation
Lucia Specia | Marina Fomicheva | Julia Ive | Zixu Wang
[1] Philipp Koehn,et al. Six Challenges for Neural Machine Translation , 2017, NMT@ACL.
[2] Sergey Levine,et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor , 2018, ICML.
[3] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[4] Rico Sennrich,et al. Nematus: a Toolkit for Neural Machine Translation , 2017, EACL.
[5] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[6] Wei Zhao,et al. SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization , 2020, ACL.
[7] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[8] Colin Cherry,et al. A Systematic Comparison of Smoothing Techniques for Sentence-Level BLEU , 2014, WMT@ACL.
[9] Di He,et al. Decoding with Value Networks for Neural Machine Translation , 2017, NIPS.
[10] Alon Lavie,et al. Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability , 2011, ACL.
[11] Lucia Specia,et al. Grounded Word Sense Translation , 2019, Proceedings of the Second Workshop on Shortcomings in Vision and Language.
[12] Petros Christodoulou,et al. Soft Actor-Critic for Discrete Action Settings , 2019, ArXiv.
[13] Mauro Cettolo,et al. WIT3: Web Inventory of Transcribed and Translated Talks , 2012, EAMT.
[14] Yang Liu,et al. Minimum Risk Training for Neural Machine Translation , 2015, ACL.
[15] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[16] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[17] Lucia Specia,et al. Multimodal Lexical Translation , 2018, LREC.
[18] Philipp Koehn,et al. Findings of the 2017 Conference on Machine Translation (WMT17) , 2017, WMT.
[19] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[20] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[21] Sergey Levine,et al. Reinforcement Learning with Deep Energy-Based Policies , 2017, ICML.
[22] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[23] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[24] Omri Abend,et al. On the Weaknesses of Reinforcement Learning for Neural Machine Translation , 2019, ICLR.
[25] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[26] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 1992, Machine Learning.
[27] Eduard H. Hovy,et al. From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction , 2018, ACL.
[28] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[29] Desmond Elliott,et al. Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description , 2017, WMT.
[30] Simultaneous Machine Translation with Visual Context , 2020, EMNLP.
[31] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[32] Joelle Pineau,et al. An Actor-Critic Algorithm for Sequence Prediction , 2016, ICLR.
[33] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[34] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[35] Marc'Aurelio Ranzato,et al. Sequence Level Training with Recurrent Neural Networks , 2015, ICLR.
[36] Matthew G. Snover,et al. A Study of Translation Edit Rate with Targeted Human Annotation , 2006, AMTA.
[37] Naren Ramakrishnan,et al. Deep Reinforcement Learning for Sequence-to-Sequence Models , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[38] Myle Ott,et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling , 2019, NAACL.
[39] Lantao Yu,et al. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient , 2016, AAAI.
[40] Alon Lavie,et al. Meteor Universal: Language Specific Translation Evaluation for Any Target Language , 2014, WMT@ACL.
[41] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[42] Stefan Riezler,et al. Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning , 2018, ACL.
[43] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.