Reward Augmented Maximum Likelihood for Neural Structured Prediction
Dale Schuurmans | Samy Bengio | Navdeep Jaitly | Yonghui Wu | Mohammad Norouzi | Zhifeng Chen | Mike Schuster
[4] Jing Peng, et al. Function Optimization using Connectionist Reinforcement Learning Algorithms, 1991.
[5] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[6] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[7] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.
[8] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[9] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.
[10] Ben Taskar, et al. Max-Margin Markov Networks, 2003, NIPS.
[12] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[13] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[14] Thomas Hofmann, et al. Large Margin Methods for Structured and Interdependent Output Variables, 2005, J. Mach. Learn. Res.
[15] Inderjit S. Dhillon, et al. Clustering with Bregman Divergences, 2005, J. Mach. Learn. Res.
[16] Emanuel Todorov, et al. Linearly-solvable Markov decision problems, 2006, NIPS.
[17] John Langford, et al. Search-based structured prediction, 2009, Machine Learning.
[18] Marc Toussaint, et al. Learning model-free robot control by a Monte Carlo EM algorithm, 2009, Auton. Robots.
[19] Tamir Hazan, et al. Direct Loss Minimization for Structured Prediction, 2010, NIPS.
[20] Noah A. Smith, et al. Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions, 2010, NAACL.
[21] Yasemin Altun, et al. Relative Entropy Policy Search, 2010.
[22] Veselin Stoyanov, et al. Empirical Risk Minimization of Graphical Model Parameters Given Approximate Inference, Decoding, and Model Structure, 2011, AISTATS.
[23] Hugo Larochelle, et al. Loss-sensitive Training of Probabilistic Conditional Random Fields, 2011, ArXiv.
[24] Daniel Povey, et al. The Kaldi Speech Recognition Toolkit, 2011.
[25] Geoffrey J. Gordon, et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning, 2010, AISTATS.
[27] Justin Domke, et al. Generic Methods for Optimization-Based Modeling, 2012, AISTATS.
[28] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[29] Patrick M. Pilarski, et al. Model-Free reinforcement learning with continuous action in practice, 2012, 2012 American Control Conference (ACC).
[30] Sergey Levine, et al. Guided Policy Search, 2013, ICML.
[31] Joelle Pineau, et al. Learning from Limited Demonstrations, 2013, NIPS.
[32] Sergey Levine, et al. Variational Policy Search via Trajectory Optimization, 2013, NIPS.
[33] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[34] Yoshua Bengio, et al. End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results, 2014, ArXiv.
[35] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[36] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[37] Andrew McCallum, et al. Learning Dynamic Feature Selection for Fast Sequential Prediction, 2015, ACL.
[38] Quoc V. Le, et al. Addressing the Rare Word Problem in Neural Machine Translation, 2014, ACL.
[39] Yoshua Bengio, et al. Task Loss Estimation for Sequence Prediction, 2015, ArXiv.
[40] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[41] Yoshua Bengio, et al. Attention-Based Models for Speech Recognition, 2015, NIPS.
[42] Rauf Izmailov, et al. Learning using privileged information: similarity control and knowledge transfer, 2015, J. Mach. Learn. Res.
[43] Quoc V. Le, et al. Listen, Attend and Spell, 2015, ArXiv.
[44] Samy Bengio, et al. Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, 2015, NIPS.
[45] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[46] Marc'Aurelio Ranzato, et al. Sequence Level Training with Recurrent Neural Networks, 2015, ICLR.
[47] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[48] Richard Socher, et al. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, 2015, ICML.
[49] Slav Petrov, et al. Globally Normalized Transition-Based Neural Networks, 2016, ACL.
[50] Alexander M. Rush, et al. Sequence-to-Sequence Learning as Beam-Search Optimization, 2016, EMNLP.
[51] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[52] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[53] Yang Liu, et al. Minimum Risk Training for Neural Machine Translation, 2015, ACL.
[54] Bernhard Schölkopf, et al. Unifying distillation and privileged information, 2015, ICLR.
[55] Joelle Pineau, et al. An Actor-Critic Algorithm for Sequence Prediction, 2016, ICLR.