Towards planning: incremental investigations into adaptive robot control

Traditional models of planning have adopted a top-down perspective, focusing on the deliberative, conscious qualities of planning at the expense of connecting the system to the world through its perceptions. This thesis takes the opposing, bottom-up perspective that being firmly situated in the world is the crucial starting point for understanding planning. Its central hypothesis is that the ability to plan developed from the more primitive capacity of reactive control.

Neural networks offer the most promising mechanism for investigating robot control and planning because connectionist methodology allows the task demands, rather than the designer's biases, to be the primary force in shaping a system's development. Input can come directly from the sensors and output can feed directly into the actuators, creating a close coupling of perception and action. This interplay between sensing and acting fosters a dynamic interaction between the controller and its environment that is crucial to producing reactive behavior. Because adaptation is fundamental to the connectionist paradigm, the designer need not posit what form the internal knowledge will take or what specific function it will serve. Instead, based on the training task, the system constructs its own internal representations directly from the sensor readings to achieve the desired control behavior. Once the system has reached an adequate level of performance at the task, its method can be dissected to yield a high-level understanding of its control principles.

This thesis takes an incremental approach towards understanding planning. In the initial phase, several ways of representing goals are explored using a simulated robot in a one-dimensional environment. Next, the model is extended to accommodate an actual physical robot, and two reinforcement learning methods for adapting the network controllers are compared: a gradient descent algorithm and a genetic algorithm. Finally, the model's behavior and representations are analyzed to reveal that it contains the potential building blocks necessary for planning. By actively restricting the extent of our presuppositions about planning, we may be able to develop truly autonomous robots with radically different forms of control and planning.
