Reinforcement learning in continuous state- and action-space
[1] Michail G. Lagoudakis,et al. Learning continuous-action control policies , 2009, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.
[2] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[3] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[4] Jennie Si,et al. Online learning control by association and reinforcement , 2001, IEEE Transactions on Neural Networks.
[5] Larry D. Pyeatt,et al. A comparison between cellular encoding and direct encoding for genetic neural networks , 1996 .
[6] Leemon C Baird,et al. Reinforcement Learning With High-Dimensional, Continuous Actions , 1993 .
[7] Jeffrey C. Lagarias,et al. Convergence Properties of the Nelder-Mead Simplex Method in Low Dimensions , 1998, SIAM J. Optim..
[8] Simon Haykin,et al. Neural Networks and Learning Machines , 2010 .
[9] Jonathan Baxter,et al. Reinforcement Learning From State and Temporal Differences , 1999 .
[10] Alex M. Andrew,et al. Reinforcement Learning: An Introduction , 1998 .
[11] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[12] Dimitris C. Dracopoulos,et al. Genetic Programming for Generalised Helicopter Hovering Control , 2012, EuroGP.
[13] Dimitris C. Dracopoulos,et al. Swing Up and Balance Control of the Acrobot Solved by Genetic Programming , 2012, SGAI Conf..
[14] Mark W. Spong,et al. The swing up control problem for the Acrobot , 1995 .
[15] Dimitris C. Dracopoulos,et al. Application of Newton's Method to action selection in continuous state- and action-space reinforcement learning , 2014, ESANN.
[16] A. P. Wieland,et al. Evolving neural network controllers for unstable systems , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[17] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[18] Christian Igel,et al. Evolution Strategies for Direct Policy Search , 2008, PPSN.
[19] A. E. Eiben,et al. Introduction to Evolutionary Computing , 2003, Natural Computing Series.
[20] Risto Miikkulainen,et al. Solving Non-Markovian Control Tasks with Neuro-Evolution , 1999, IJCAI.
[21] Hiroshi Kinjo,et al. On the Continuous Control of the Acrobot via Computational Intelligence , 2009, IEA/AIE.
[22] Verena Heidrich-Meisner,et al. Neuroevolution strategies for episodic reinforcement learning , 2009, J. Algorithms.
[23] Stephen P. Boyd,et al. Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.
[24] Mark W. Spong,et al. Swing up control of the Acrobot , 1994, Proceedings of the 1994 IEEE International Conference on Robotics and Automation.
[25] Xin Xu,et al. Kernel-Based Least Squares Policy Iteration for Reinforcement Learning , 2007, IEEE Transactions on Neural Networks.
[26] Javier de Lope,et al. The kNN-TD Reinforcement Learning Algorithm , 2009 .
[27] Simon X. Yang,et al. Comprehensive Unified Control Strategy for Underactuated Two-Link Manipulators , 2009, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[28] John R. Koza,et al. Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.
[29] Sean Luke,et al. A Comparison of Bloat Control Methods for Genetic Programming , 2006, Evolutionary Computation.
[30] Ashwin Ram,et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces , 1997, Adapt. Behav..
[31] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[32] Bart De Schutter,et al. Reinforcement Learning and Dynamic Programming Using Function Approximators , 2010 .
[33] Charles W. Anderson,et al. Comparison of CMACs and radial basis functions for local function approximators in reinforcement learning , 1997, Proceedings of International Conference on Neural Networks (ICNN'97).
[34] Gary Boone,et al. Minimum-time control of the Acrobot , 1997, Proceedings of International Conference on Robotics and Automation.
[35] Eiho Uezato,et al. Swing-up control of a 3-DOF acrobot using an evolutionary approach , 2009, Artificial Life and Robotics.
[36] Rémi Coulom,et al. High-accuracy value-function approximation with neural networks applied to the acrobot , 2004, ESANN.
[37] Peter Stone,et al. Empowerment for continuous agent-environment systems , 2011, Adapt. Behav..
[38] James S. Albus,et al. New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC) , 1975 .
[39] John A. Nelder,et al. A Simplex Method for Function Minimization , 1965, Comput. J..
[40] Shalabh Bhatnagar,et al. Natural actor-critic algorithms , 2009, Autom..
[41] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[42] Martin A. Riedmiller,et al. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[43] Hado van Hasselt,et al. Reinforcement Learning in Continuous State and Action Spaces , 2012, Reinforcement Learning.
[44] Gary Boone,et al. Efficient reinforcement learning: model-based Acrobot control , 1997, Proceedings of International Conference on Robotics and Automation.
[45] Junichiro Yoshimoto,et al. Acrobot control by learning the switching of multiple controllers , 2005, Artificial Life and Robotics.
[46] Dimitris C. Dracopoulos,et al. Genetic programming as a solver to challenging reinforcement learning problems , 2013 .
[47] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[48] Hiroshi Kinjo,et al. A switch controller design for the acrobot using neural network and genetic algorithm , 2008, 2008 10th International Conference on Control, Automation, Robotics and Vision.
[49] Laurene V. Fausett,et al. Fundamentals Of Neural Networks , 1994 .
[50] D K Smith,et al. Numerical Optimization , 2001, J. Oper. Res. Soc..
[51] Dominique Bonvin,et al. Quotient method for controlling the acrobot , 2009, Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[52] R. Bellman. Dynamic programming , 1957, Science.
[53] B.M. Wilamowski,et al. Neural network architectures and learning algorithms , 2009, IEEE Industrial Electronics Magazine.
[54] Warren B. Powell,et al. Approximate Dynamic Programming - Solving the Curses of Dimensionality , 2007 .
[55] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[56] M.A. Wiering,et al. Reinforcement Learning in Continuous Action Spaces , 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[57] Matthijs T. J. Spaan,et al. Partially Observable Markov Decision Processes , 2010, Encyclopedia of Machine Learning.
[58] Riccardo Poli,et al. A Field Guide to Genetic Programming , 2008 .
[59] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[60] H. Martín,et al. Ex〈α〉: An effective algorithm for continuous actions Reinforcement Learning problems , 2009 .
[61] Christian Igel,et al. Reinforcement learning in a nutshell , 2007, ESANN.
[62] Stephan K. Chalup,et al. A small spiking neural network with LQR control applied to the acrobot , 2008, Neural Computing and Applications.
[63] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..