Basis Expansion in Natural Actor Critic Methods
暂无分享,去创建一个
[1] Lihong Li,et al. Analyzing feature generation for value-function approximation , 2007, ICML '07.
[2] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[3] Vijay R. Konda,et al. OnActor-Critic Algorithms , 2003, SIAM J. Control. Optim..
[4] Shalabh Bhatnagar,et al. Natural actorcritic algorithms. , 2009 .
[5] Christian Lebiere,et al. The Cascade-Correlation Learning Architecture , 1989, NIPS.
[6] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[7] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[8] R. Bellman,et al. Dynamic Programming and Markov Processes , 1960 .
[9] Sridhar Mahadevan,et al. Representation Policy Iteration , 2005, UAI.
[10] Sridhar Mahadevan,et al. Constructing basis functions from directed graphs for value function approximation , 2007, ICML '07.
[11] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[12] Isabelle Guyon,et al. An Introduction to Variable and Feature Selection , 2003, J. Mach. Learn. Res..
[13] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[14] Shie Mannor,et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning , 2005, Ann. Oper. Res..
[15] Martin A. Riedmiller,et al. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[16] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[17] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[18] Doina Precup,et al. Combining TD-learning with Cascade-correlation Networks , 2003, ICML.
[19] Stefan Schaal,et al. Policy Gradient Methods for Robotics , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[20] Shie Mannor,et al. Automatic basis function construction for approximate dynamic programming and reinforcement learning , 2006, ICML.
[21] Shalabh Bhatnagar,et al. Incremental Natural Actor-Critic Algorithms , 2007, NIPS.
[22] Martin A. Riedmiller,et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm , 1993, IEEE International Conference on Neural Networks.
[23] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.