Paper Information - An Evolutionary Approach to Automatic Construction of the Structure in Hierarchical Reinforcement Learning

We plan to implement our method on real hardware. A foreseeable extension of this study is to generalize the method into a model of the cooperative and competitive mechanisms among learning modules in the brain.

[1] Peter Dayan, et al. Q-learning, 1992, Machine Learning.

[2] Anders Eriksson TRITA-NA-Eyynn. Evolution of Meta-parameters in Reinforcement Learning, 2003.

[3] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[4] John R. Koza, et al. Genetic programming - on the programming of computers by means of natural selection, 1993, Complex adaptive systems.

[5] Martin C. Martin, et al. Visual obstacle avoidance using genetic programming: first results, 2001.

[6] M. B. Tahir, et al. An overview of Genetic Algorithms, 2003.

[7] K. Downing. Adaptive genetic programs via reinforcement learning, 2001.

[8] David J. Montana, et al. Strongly Typed Genetic Programming, 1995, Evolutionary Computation.

[9] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.

[10] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.

[11] Melanie Mitchell, et al. An introduction to genetic algorithms, 1996.

[12] Mitsuo Kawato, et al. Multiple Model-Based Reinforcement Learning, 2002, Neural Computation.

[13] Minoru Asada, et al. Cooperative and competitive behavior acquisition for mobile robots through co-evolution, 1999.

[14] Peter Nordin, et al. An On-Line Method to Evolve Behavior and to Control a Miniature Robot in Real Time with Genetic Programming, 1996, Adapt. Behav.

[15] Gregory S. Hornby, et al. Autonomous evolution of gaits with the Sony Quadruped Robot, 1999.

[16] Thomas G. Dietterich. What is machine learning?, 2020, Archives of Disease in Childhood.

[17] J. Galletly. An Overview of Genetic Algorithms, 1992.

[18] Gerald Tesauro, et al. Temporal difference learning and TD-Gammon, 1995, CACM.

[19] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.

[20] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.

[21] Thomas Bäck, et al. Evolutionary computation: an overview, 1996, Proceedings of IEEE International Conference on Evolutionary Computation.