RETRACTED ARTICLE: A hierarchical learning architecture with multiple-goal representations and multiple timescale based on approximate dynamic programming
Shuai Li | Sanfeng Chen | Bo Liu | Yongsheng Liang | Yuesheng Lou
[1] J. Hawkins, et al. Why Can't a Computer be more Like a Brain?, 2007, IEEE Spectrum.
[2] Long Lin, et al. Memory Approaches to Reinforcement Learning in Non-Markovian Domains, 1992.
[3] Frank L. Lewis, et al. Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem, 2009, 2009 International Joint Conference on Neural Networks.
[4] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[5] Jette Randløv, et al. Shaping in Reinforcement Learning by Changing the Physics of the Problem, 2000, ICML.
[6] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[7] Michael I. Jordan, et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems, 1994, NIPS.
[8] W. B. Gail. Climate control, 2007.
[9] Robert Kozma, et al. Beyond Feedforward Models Trained by Backpropagation: A Practical Training Tool for a More Efficient Universal Approximator, 2007, IEEE Transactions on Neural Networks.
[10] Marco Colombetti, et al. Robot Shaping: Developing Autonomous Agents Through Learning, 1994, Artif. Intell.
[11] John N. Tsitsiklis, et al. Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms, 1984, 1984 American Control Conference.
[12] Michael I. Jordan, et al. Massachusetts Institute of Technology, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, 1996.
[13] Jude W. Shavlik, et al. Creating advice-taking reinforcement learners, 2004, Machine Learning.
[14] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[15] Paul J. Werbos, et al. 2009 Special Issue: Intelligence in the brain: A theory of how it works and how to build it, 2009.
[16] Preben Alstrøm, et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping, 1998, ICML.
[17] Kevin P. Murphy, et al. A Survey of POMDP Solution Techniques, 2000.
[18] Haibo He, et al. Two-time-scale online actor-critic paradigm driven by POMDP, 2010, 2010 International Conference on Networking, Sensing and Control (ICNSC).
[19] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes, 1991.
[20] Leonid M. Fridman, et al. Generating Self-Excited Oscillations via Two-Relay Controller, 2009, IEEE Transactions on Automatic Control.
[21] E. Eweda, et al. Convergence of an adaptive linear estimation algorithm, 1984.
[22] Leslie Pack Kaelbling, et al. Acting Optimally in Partially Observable Stochastic Domains, 1994, AAAI.
[23] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[24] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[25] Jennie Si, et al. Online learning control by association and reinforcement, 2001, IEEE Transactions on Neural Networks.
[26] Milos Hauskrecht, et al. Value-Function Approximations for Partially Observable Markov Decision Processes, 2000, J. Artif. Intell. Res.
[27] Sheng Chen, et al. Adaptive Dual Network Design for a Class of SIMO Systems with Nonlinear Time-variant Uncertainties, 2010.
[28] Miroslav Krstic, et al. Nonlinear and Adaptive Control Design, 1995.
[29] Shalabh Bhatnagar, et al. Natural actor-critic algorithms, 2009, Autom.
[30] R. Bellman. Dynamic programming, 1957, Science.
[31] P.J. Werbos, et al. Using ADP to Understand and Replicate Brain Intelligence: the Next Level Design, 2007, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning.
[32] C.W. Anderson, et al. Learning to control an inverted pendulum using neural networks, 1989, IEEE Control Systems Magazine.
[33] Charles W. Anderson, et al. Strategy Learning with Multilayer Connectionist Representations, 1987.
[34] Warren B. Powell, et al. Handbook of Learning and Approximate Dynamic Programming, 2006, IEEE Transactions on Automatic Control.
[35] Ethem Alpaydin, et al. Introduction to Machine Learning, 2004, Adaptive Computation and Machine Learning.
[36] Bo Liu, et al. Basis Construction from Power Series Expansions of Value Functions, 2010, NIPS.
[37] Yajin Zhou, et al. SP-NN: A novel neural network approach for path planning, 2007, 2007 IEEE International Conference on Robotics and Biomimetics (ROBIO).
[38] Michael I. Jordan, et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes, 1994, ICML.
[39] Haibo He, et al. A hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming, 2010, 2010 International Conference on Networking, Sensing and Control (ICNSC).
[40] J. Hawkins, et al. On Intelligence, 2004.
[41] John N. Tsitsiklis, et al. Linear stochastic approximation driven by slowly varying Markov chains, 2003, Syst. Control. Lett.
[42] Bo Liu, et al. Adaptive Dual Network Design for a Class of SIMO Systems with Nonlinear Time-variant Uncertainties, 2010.