Reinforcement learning by GA using importance sampling
Shigenobu Kobayashi | Jun Sakuma | Hajime Kimura | Chikao Tsuchiya
[1] Shigenobu Kobayashi, et al. Reinforcement learning of walking behavior for a four-legged robot, 2001, Proceedings of the 40th IEEE Conference on Decision and Control.
[2] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[3] S. Kobayashi, et al. Theoretical analysis of the unimodal normal distribution crossover for real-coded genetic algorithms, 1998, 1998 IEEE International Conference on Evolutionary Computation Proceedings. IEEE World Congress on Computational Intelligence.
[4] Risto Miikkulainen, et al. Confidence Based Dual Reinforcement Q-Routing: An adaptive online network routing algorithm, 1999, IJCAI.
[5] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[6] R. Bellman. Dynamic programming, 1957, Science.
[7] Shigenobu Kobayashi, et al. Reinforcement Learning in POMDPs with Function Approximation, 1997, ICML.
[8] Hiroaki Satoh, et al. Minimal generation gap model for GAs considering both exploration and exploitation, 1996.
[9] Isao Ono, et al. A Real Coded Genetic Algorithm for Function Optimization Using Unimodal Normal Distributed Crossover, 1997, ICGA.
[10] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.
[11] Sanjoy Dasgupta, et al. Off-Policy Temporal Difference Learning with Function Approximation, 2001, ICML.
[12] Christian R. Shelton, et al. Policy Improvement for POMDPs Using Normalized Importance Sampling, 2001, UAI.
[13] Dimitri P. Bertsekas, et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems, 1996, NIPS.
[14] Michael L. Littman, et al. Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach, 1993, NIPS.
[15] Leonid Peshkin, et al. Learning from Scarce Experience, 2002, ICML.