Actor-Critic Reinforcement Learning with Energy-Based Policies
Yee Whye Teh | David Silver | Nicolas Heess
[1] David Haussler,et al. Unsupervised learning of distributions on binary vectors using two layer networks , 1991, NIPS 1991.
[2] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[3] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[4] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res.
[5] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[6] Geoffrey E. Hinton,et al. Reinforcement learning for factored Markov decision processes , 2002.
[7] Geoffrey E. Hinton,et al. Exponential Family Harmoniums with an Application to Information Retrieval , 2004, NIPS.
[8] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[9] Geoffrey E. Hinton,et al. Reinforcement Learning with Factored States and Actions , 2004, J. Mach. Learn. Res.
[10] Peter Szabó,et al. Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods , 2005, NIPS.
[11] Fu Jie Huang,et al. A Tutorial on Energy-Based Learning , 2006.
[12] Geoffrey E. Hinton,et al. Restricted Boltzmann machines for collaborative filtering , 2007, ICML '07.
[13] Gökhan Bakır,et al. Predicting Structured Data , 2008.
[14] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[15] Yoshua Bengio,et al. Classification using discriminative restricted Boltzmann machines , 2008, ICML '08.
[16] Shalabh Bhatnagar,et al. Natural actor-critic algorithms , 2009, Autom.
[17] Geoffrey E. Hinton,et al. Factored conditional restricted Boltzmann Machines for modeling motion style , 2009, ICML '09.
[18] Shalabh Bhatnagar,et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation , 2009, NIPS.
[19] Kenji Doya,et al. Free-Energy Based Reinforcement Learning for Vision-Based Navigation with High-Dimensional Sensory Inputs , 2010, ICONIP.
[20] Geoffrey E. Hinton,et al. Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines , 2010, Neural Computation.
[21] Junichiro Yoshimoto,et al. Free-energy-based reinforcement learning in a partially observable environment , 2010, ESANN.
[22] Geoffrey E. Hinton,et al. Conditional Restricted Boltzmann Machines for Structured Output Prediction , 2011, UAI.