Why Experimentation can be better than "Perfect Guidance"

The full version of this paper appeared at ICML-97. Many problems correspond to the classical control task of determining the appropriate control action to take, given some (sequence of) observations. One standard approach to learning these control rules, called behavior cloning, involves watching a perfect operator operate a plant, and then trying to emulate its behavior. In the experimental learning approach, by contrast, the learner first guesses an initial operation-to-action policy and tries it out. If this policy performs sub-optimally, the learner can modify it to produce a new policy, and recur. This paper discusses the relative effectiveness of these two approaches, especially in the presence of perceptual aliasing, showing in particular that the experimental learner can often learn more effectively than the cloning one.
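The contrast between the two approaches can be illustrated with a small sketch. It is not taken from the paper: the four-state, one-step task, its reward table, and the names `clone_policy` and `experiment` are all assumptions invented for illustration. States 0, 1 and 2 are perceptually aliased (the learner sees the same observation `"x"` for all three), while the perfect operator sees the true state. The cloning learner copies the operator's most frequent action per observation; the experimental learner starts from a guessed policy and keeps any single change that improves measured performance.

```python
from collections import Counter, defaultdict

# Hypothetical toy task (an assumption, not from the paper).
STATES = [0, 1, 2, 3]
ACTIONS = ["a", "b"]
# States 0, 1, 2 are aliased: the learner cannot tell them apart.
OBSERVE = {0: "x", 1: "x", 2: "x", 3: "y"}
OBSERVATIONS = sorted(set(OBSERVE.values()))
REWARD = {
    (0, "a"): 1.0, (0, "b"): 0.9,
    (1, "a"): 1.0, (1, "b"): 0.9,
    (2, "a"): -10.0, (2, "b"): 1.0,
    (3, "a"): 0.0, (3, "b"): 1.0,
}


def expert_action(state):
    """The perfect operator sees the true state and picks the best action."""
    return max(ACTIONS, key=lambda action: REWARD[(state, action)])


def value(policy):
    """Average reward of an observation-to-action policy over uniform start states."""
    return sum(REWARD[(s, policy[OBSERVE[s]])] for s in STATES) / len(STATES)


def clone_policy():
    """Behavior cloning: imitate the operator's most frequent action per observation."""
    votes = defaultdict(Counter)
    for s in STATES:
        votes[OBSERVE[s]][expert_action(s)] += 1
    return {obs: counts.most_common(1)[0][0] for obs, counts in votes.items()}


def experiment(initial_policy):
    """Experimental learning: try the policy, change it where that helps, repeat."""
    policy = dict(initial_policy)
    improved = True
    while improved:
        improved = False
        for obs in OBSERVATIONS:
            for action in ACTIONS:
                candidate = {**policy, obs: action}
                if value(candidate) > value(policy):
                    policy, improved = candidate, True
    return policy


clone = clone_policy()
learned = experiment({obs: "a" for obs in OBSERVATIONS})
print("cloned policy :", clone, "value", value(clone))
print("learned policy:", learned, "value", value(learned))
```

In this constructed case the operator chooses `a` in two of the three aliased states, so the clone also chooses `a` for observation `"x"` and incurs the large penalty in state 2. The experimental learner, which judges policies only by their measured performance, settles on `b` for `"x"`. The example shows one way aliasing can hurt cloning; it makes no claim about how often this happens in practice.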
