RETRACTED ARTICLE: A hierarchical learning architecture with multiple-goal representations and multiple timescale based on approximate dynamic programming

This article has been retracted with the agreement of the authors and the Editor-in-Chief, owing to concerns over incorrect references and unacknowledged authorship.
