Beyond adaptive-critic creative learning for intelligent mobile robots

Intelligent industrial and mobile robots may be considered proven technology in structured environments. Teach programming and supervised learning methods permit solutions to a variety of applications. However, we believe that extending the operation of these machines to more unstructured environments requires a new learning method. Both unsupervised learning and reinforcement learning are potential candidates for these new tasks. The adaptive critic method has been shown to provide useful approximations of, or even optimal, control policies for non-linear systems. The purpose of this paper is to explore the use of new learning methods that go beyond the adaptive critic method for unstructured environments. The adaptive critic is a form of reinforcement learning. A critic element provides only high-level grading corrections to a cognition module that controls the action module. In the proposed system the critic's grades are modeled and forecasted, so that an anticipated set of sub-grades is available to the cognition module. The forecasted grades are interpolated and are available on the time scale needed by the action module. The success of the system is highly dependent on the accuracy of the forecasted grades and the adaptability of the action module. Examples from the guidance of a mobile robot are provided to illustrate the method for simple line following and for the more complex navigation and control in an unstructured environment. The theory presented that is beyond the adaptive critic may be called creative theory. Creative theory is a form of learning that models the highest level of human learning: imagination. Creative theory appears to apply not only to mobile robots but also to many other forms of human endeavor, such as educational learning and business forecasting. Reinforcement learning such as the adaptive critic may be applied to known problems to aid in the discovery of their solutions. The significance of creative theory is that it permits the discovery of unknown problems, ones that are not yet recognized but may be critical to survival or success.
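
The forecast-then-interpolate step described above can be illustrated with a minimal sketch. The abstract does not state which forecasting or interpolation model is used, so the least-squares linear trend and the linear interpolation below are assumptions chosen for simplicity, and the function names `forecast_next_grade` and `interpolate_sub_grades` are hypothetical.

```python
def forecast_next_grade(times, grades):
    """Forecast the critic's next grade one critic interval ahead.

    Assumption: a least-squares linear trend over the past grades.
    Requires at least two past (time, grade) samples.
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_g = sum(grades) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov_tg = sum((t - mean_t) * (g - mean_g) for t, g in zip(times, grades))
    slope = cov_tg / var_t if var_t > 0 else 0.0
    # Assume the next grade arrives after the same gap as the last two.
    next_t = times[-1] + (times[-1] - times[-2])
    return next_t, mean_g + slope * (next_t - mean_t)


def interpolate_sub_grades(t0, g0, t1, g1, n_steps):
    """Linearly interpolate sub-grades between the last known grade
    (t0, g0) and the forecasted grade (t1, g1), giving n_steps values
    on the finer time scale of the action module."""
    return [
        (t0 + (t1 - t0) * k / n_steps, g0 + (g1 - g0) * k / n_steps)
        for k in range(1, n_steps + 1)
    ]


# Illustrative data only: sparse critic grades at coarse time steps.
critic_times = [0.0, 1.0, 2.0]
critic_grades = [0.2, 0.4, 0.6]

t_next, g_next = forecast_next_grade(critic_times, critic_grades)
sub_grades = interpolate_sub_grades(
    critic_times[-1], critic_grades[-1], t_next, g_next, n_steps=4
)
for t, g in sub_grades:
    print(f"t={t:.2f}  anticipated sub-grade={g:.3f}")
```

In this sketch the action module would consume one sub-grade per control step instead of waiting for the next sparse critic grade; how well this works depends on the accuracy of the forecast, as the abstract notes.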
