Skillful control under uncertainty via direct reinforcement learning

Complexity and uncertainty in modern robots and other autonomous systems make it difficult to design controllers that achieve the desired levels of precision and robustness. Learning methods are therefore being incorporated into such controllers to provide the adaptability needed to meet the performance demands of the task. We argue that, for learning tasks that arise frequently in control applications, the most useful methods in practice are probably those we call direct associative reinforcement learning methods. We describe these methods and illustrate, with an example, their utility for learning skilled robot control under uncertainty.
