Q-Error as a Selection Mechanism in Modular Reinforcement-Learning Systems