Learning the structure of activities for a mobile robot

At birth, the human infant has only a rudimentary perceptual system and similarly rudimentary control over its musculature. As the child develops, its ability to control, perceive, and predict its own behavior improves through interaction with its environment. We are interested in this process of development, in particular with respect to activity. How might an intelligent agent of our own design learn to represent and organize procedural knowledge so that, over time, it becomes more competent at achieving its goals in its own environment? In this dissertation, we present a system that allows an agent to learn models of its activity and its environment, and then to use those models to create units of behavior of increasing sophistication for the purpose of achieving its own internally generated goals.
