HYBRID TRAJECTORY PLANNING USING REINFORCEMENT AND BACKPROPAGATION THROUGH TIME TECHNIQUES

A novel approach to trajectory planning for a mobile robot is presented. The robot is assumed to move in a two-dimensional workspace and to receive continuous input from the surrounding environment: a signal that reflects the instantaneous distance and position of an obstacle. In the first stage, a neural network directs the robot as it moves at constant speed from an initial point to a given target. The network is trained with an original hybrid scheme that combines instantaneous reinforcement learning with long-term backpropagation through time learning. The two techniques complement each other by providing online and offline learning. In the second stage, the trained network is tested with obstacles different from those used in learning. The network should be capable of discovering a strategy to steer the robot. Because the robot is assumed to move with constant speed and zero acceleration, the network output is just the direction of motion. Future work may include robot dynamics, so that the network outputs not only a direction but also an amount of motion.
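The constant-speed assumption can be illustrated with a minimal sketch in which the only control at each step is a heading angle. Everything here is an assumption made for illustration: the function names, the time step `dt`, and in particular `heading_to_target`, a placeholder that points straight at the target and stands in for the trained neural network. It does not reproduce the hybrid learning scheme or any obstacle input.

```python
import math

def step(x, y, heading, speed=1.0, dt=0.1):
    """Advance a constant-speed point robot one time step along `heading` (radians)."""
    return (x + speed * dt * math.cos(heading),
            y + speed * dt * math.sin(heading))

def heading_to_target(x, y, tx, ty):
    """Hypothetical placeholder policy: aim directly at the target.
    In the paper's setting, a trained network would supply this direction."""
    return math.atan2(ty - y, tx - x)

def rollout(start, target, speed=1.0, dt=0.1, max_steps=1000):
    """Simulate the robot until it is within one step length of the target."""
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        if math.hypot(target[0] - x, target[1] - y) <= speed * dt:
            break
        heading = heading_to_target(x, y, target[0], target[1])
        x, y = step(x, y, heading, speed, dt)
        path.append((x, y))
    return path

path = rollout(start=(0.0, 0.0), target=(3.0, 4.0))
```

Because the speed is fixed, every segment of `path` has the same length `speed * dt`; only the direction varies from step to step, which is why a single output (the heading) suffices under this assumption.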
