Exploring parameter space in reinforcement learning
Tom Schaul | Yi Sun | Frank Sehnke | Jürgen Schmidhuber | Daan Wierstra | Thomas Rückstieß
[1] W. Pinebrook. The evolution of strategy, 1990, Case studies in health administration.
[2] Michael I. Jordan. Attractor dynamics and parallelism in a connectionist sequential machine, 1990.
[3] Sebastian Thrun, et al. The role of exploration in learning control, 1992.
[4] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[5] Donald A. Sofge, et al. Handbook of Intelligent Control: Neural, Fuzzy, and Adaptive Approaches, 1992.
[6] James Kennedy, et al. Particle swarm optimization, 2002, Proceedings of ICNN'95 - International Conference on Neural Networks.
[7] Rainer Storn, et al. Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces, 1997, J. Glob. Optim.
[8] Stewart W. Wilson, et al. From Animals to Animats 5. Proceedings of the Fifth International Conference on Simulation of Adaptive Behavior, 1997.
[9] Shun-ichi Amari, et al. Why natural gradient?, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[10] Jürgen Schmidhuber, et al. Efficient model-based exploration, 1998.
[11] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[12] Peter L. Bartlett, et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent, 2000, ICML.
[13] J. A. Lozano, et al. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation, 2001.
[14] Nikolaus Hansen, et al. Completely Derandomized Self-Adaptation in Evolution Strategies, 2001, Evolutionary Computation.
[15] Pedro Larrañaga, et al. Estimation of Distribution Algorithms, 2002, Genetic Algorithms and Evolutionary Computation.
[16] Petros Koumoutsakos, et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES), 2003, Evolutionary Computation.
[17] Jun Nakanishi, et al. Learning Movement Primitives, 2005, ISRR.
[18] Douglas Aberdeen, et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes, 2003.
[19] Pat Langley, et al. Editorial: On Machine Learning, 1986, Machine Learning.
[20] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[21] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[22] Petros Koumoutsakos, et al. Learning Probability Distributions in Continuous Evolutionary Algorithms - a Comparative Review, 2004, Nat. Comput.
[23] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 2005, ECML.
[24] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[25] Rémi Munos, et al. Policy Gradient in Continuous Time, 2006, J. Mach. Learn. Res.
[26] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[27] Stefan Schaal, et al. Policy Gradient Methods for Robotics, 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[28] Martin Lauer, et al. Making a Robot Learn to Play Soccer Using Reward and Punishment, 2007, KI.
[29] Riccardo Poli, et al. Particle swarm optimization, 1995, Swarm Intelligence.
[30] M.A. Wiering, et al. Reinforcement Learning in Continuous Action Spaces, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[31] Martin A. Riedmiller, et al. Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[32] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[33] Frank Sehnke, et al. Policy Gradients with Parameter-Based Exploration for Control, 2008, ICANN.
[34] Tom Schaul, et al. Natural Evolution Strategies, 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[35] Tom Schaul, et al. Efficient natural evolution strategies, 2009, GECCO.
[36] Tom Schaul, et al. Stochastic search using the natural gradient, 2009, ICML '09.
[37] D. E. Ivanov (Institute of Applied Mathematics and Mechanics NAS of Ukraine, Donetsk). Parallel Fault Simulation on Multi-Core Processors, 2009.
[38] Frank Sehnke, et al. Multimodal Parameter-exploring Policy Gradients, 2010, 2010 Ninth International Conference on Machine Learning and Applications.
[39] Frank Sehnke, et al. Parameter-exploring policy gradients, 2010, Neural Networks.
[40] P. Cochat, et al. Et al, 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.