SB-CoRLA: Schema-Based Constructivist Robot Learning Architecture

This dissertation explores schema-based robot learning. I developed SB-CoRLA (Schema-Based, Constructivist Robot Learning Architecture) to address constructivist robot learning in a schema-based robot system. SB-CoRLA extends the previously developed ASyMTRe (Automated Synthesis of Multi-robot Task solutions through software Reconfiguration) architecture to enable constructivist learning for multi-robot team tasks. The schema-based ASyMTRe architecture has successfully solved the problem of automatically synthesizing task solutions based on robot capabilities, but it has no learning ability. Nothing is learned from past experience, so each time a new task is assigned to a new team of robots, the search for a solution starts anew. Furthermore, a robot cannot develop a new behavior.
The complete SB-CoRLA architecture includes off-line and online learning processes. For my dissertation, I implemented a schema chunking process within the SB-CoRLA framework that involves off-line evolutionary learning of partial solutions (also called "chunks") and online solution search using the learned chunks. The chunks are higher-level building blocks than the original schemas. They have interfaces similar to those of the original schemas and can be used in an extended version of the ASyMTRe online solution search process. SB-CoRLA can also include other learning processes, such as an online learning process that combines exploration with goal-directed feedback evaluation to develop new behaviors by modifying and extending existing schemas. This online learning process is planned for future work.
The significance of this work is the development of an architecture that enables continuous, constructivist learning by incorporating learning capabilities into a schema-based robot system. This allows robot teams to re-use previous task solutions for both existing and new tasks, to build up more abstract schema chunks, and to develop new schemas. The schema chunking process can generate solutions in certain situations where the centralized ASyMTRe cannot find solutions in a timely manner. The chunks can be re-used across different applications, which improves search efficiency.
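
The idea that a chunk exposes a schema-like interface and shortens the online search can be illustrated with a small sketch. This is not the ASyMTRe or SB-CoRLA implementation: the `Schema` class, the `make_chunk` and `search` helpers, the greedy forward-chaining search, and all schema and information-type names are hypothetical, chosen only to show how a pre-learned chunk could replace several primitive schemas in a solution.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Schema:
    """Hypothetical building block: consumes and produces information types."""
    name: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def make_chunk(name: str, parts: Iterable[Schema]) -> Schema:
    """Collapse an ordered chain of schemas into one block with the same interface."""
    needed, produced = set(), set()
    for part in parts:
        needed |= part.inputs - produced   # inputs not supplied inside the chunk
        produced |= part.outputs
    return Schema(name, frozenset(needed), frozenset(produced))


def search(goal: str, available: Iterable[str], blocks: List[Schema]) -> Optional[List[str]]:
    """Greedy forward chaining (an assumed stand-in for the real search)."""
    have = set(available)
    plan: List[str] = []
    progress = True
    while goal not in have and progress:
        progress = False
        for block in blocks:
            # Apply the first block that is runnable and adds something new.
            if block.inputs <= have and not block.outputs <= have:
                have |= block.outputs
                plan.append(block.name)
                progress = True
                break
    return plan if goal in have else None


# Hypothetical primitive schemas.
laser = Schema("laser", frozenset(), frozenset({"laser_scan"}))
localize = Schema("localize", frozenset({"laser_scan"}), frozenset({"global_position"}))
goto = Schema("goto", frozenset({"global_position"}), frozenset({"motor_command"}))

# A chunk assumed to have been learned off-line; listed first so the search tries it early.
navigate = make_chunk("navigate", [localize, goto])

plan_without_chunk = search("motor_command", [], [laser, localize, goto])
plan_with_chunk = search("motor_command", [], [navigate, laser, localize, goto])
```

In this toy setting the plan built from primitives has three steps, while the plan that uses the `navigate` chunk has two, which is the kind of saving the chunking process aims for at a much larger scale.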
