Abstraction in Control Learning

[...] to induce abstract classes of state tasks. The second approach seeks to learn such state classes by constructing hierarchical connectionist networks whose units act as abstract features or concepts. Both approaches are designed to facilitate control over memory resources, allowing learning to accelerate from early rote memorization to more globally scaled generalization.
