Qualitative reinforcement learning control

An attempt is made to develop a reinforcement learning controller for a system described in more abstract, behavioral terms than those addressed by most controllers. The learning experiments center on the behavior of a ball rolling on a track and progress from predicting that behavior to controlling it. The experiments are also evaluated as a basis for reasoning about learning and experimentation at higher levels. In the ball system considered, the abstract description is the qualitative behavior of the system, given certain state information and certain knowledge. The knowledge used is described, and the necessity of solving the problem is explained. The knowledge description is cast as part of a hierarchical controller, and generalizations to higher forms of learning are proposed.
