Efficient Non-Linear Control by Combining Q-learning with Local Linear Controllers
[1] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] Richard S. Sutton, et al. Reinforcement learning architectures for animats, 1991.
[3] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks, 2004, Machine Learning.
[4] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[5] Vijaykumar Gullapalli, et al. Reinforcement learning and its application to control, 1992.
[6] Andrew W. Moore, et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces, 2004, Machine Learning.
[7] Long Ji Lin, et al. Scaling Up Reinforcement Learning for Robot Control, 1993, International Conference on Machine Learning.
[8] Leslie Pack Kaelbling, et al. Hierarchical Learning in Stochastic Domains: Preliminary Results, 1993, ICML.
[9] Peter D. Lawrence, et al. Transition Point Dynamic Programming, 1993, NIPS.
[10] Shigenobu Kobayashi, et al. Reinforcement Learning by Stochastic Hill Climbing on Discounted Reward, 1995, ICML.
[11] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[12] Chin-Teng Lin, et al. Reinforcement learning for an ART-based fuzzy adaptive learning control network, 1996, IEEE Trans. Neural Networks.
[13] Richard S. Sutton, et al. Reinforcement Learning with Replacing Eligibility Traces, 2005, Machine Learning.
[14] Scott Davies, et al. Multidimensional Triangulation and Interpolation for Reinforcement Learning, 1996, NIPS.
[15] Kenji Doya, et al. Efficient Nonlinear Control with Actor-Tutor Architecture, 1996, NIPS.
[16] Matthias Heger. The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks, 1996, Machine Learning.
[17] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[18] Stephan Pareigis, et al. Adaptive Choice of Grid and Time in Reinforcement Learning, 1997, NIPS.
[19] Andrew W. Moore, et al. Applying Online Search Techniques to Continuous-State Reinforcement Learning, 1998, AAAI/IAAI.
[20] Doina Precup, et al. Intra-Option Learning about Temporally Abstract Actions, 1998, ICML.
[21] Shigenobu Kobayashi, et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function, 1998, ICML.
[22] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[23] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[24] Satinder Singh, et al. An upper bound on the loss from approximate optimal-value functions, 1994, Machine Learning.
[25] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.