Using expectations to monitor robotic progress and recover from problems

How does a robot know when something goes wrong? Our research answers this question by leveraging expectations, predictions about the immediate future, and using the mismatch between those expectations and the external world to monitor the robot’s progress. We use the ACT-R (Adaptive Control of Thought-Rational) cognitive architecture to learn associations between the current state of the robot and the world, the action to be performed, and the resulting future state of the world. These associations are used to generate expectations that the architecture then matches against the next observed state of the world. A significant mismatch between these expectations and the actual state of the world indicates a problem, possibly resulting from unexpected consequences of the robot’s actions, unforeseen changes in the environment, or unanticipated actions of other agents. When a problem is detected, the recovery model can suggest a number of recovery options. If the situation is unknown, that is, if the mismatch between expectations and the world is novel, the robot applies a recovery solution drawn from a set of heuristic options. When a recovery option is successfully applied, the robot learns to associate that option with the mismatch; when the same problem is encountered later, the robot can apply the learned recovery solution rather than falling back on the heuristics or randomly exploring the space of recovery solutions. We present results from execution monitoring and recovery performed during an assessment conducted at the Combined Arms Collective Training Facility (CACTF) at Fort Indiantown Gap.
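
The monitor-and-recover loop described above can be illustrated with a minimal sketch. This is not the paper’s ACT-R implementation, which learns these associations in the architecture’s memory; it only mirrors the control flow. All names here (ExpectationMonitor, the mismatch signature, the Jaccard-style state comparison) are hypothetical, and states are assumed to be hashable feature tuples.

```python
import random

class ExpectationMonitor:
    """Toy expectation-based execution monitor (illustrative only)."""

    def __init__(self, mismatch_threshold, heuristic_recoveries):
        self.threshold = mismatch_threshold
        self.heuristics = list(heuristic_recoveries)
        self.expectations = {}        # (state, action) -> expected next state
        self.learned_recoveries = {}  # mismatch signature -> recovery that worked

    def observe(self, state, action, next_state):
        # Learn the association between the current state, the action
        # performed, and the state of the world that followed.
        self.expectations[(state, action)] = next_state

    def expect(self, state, action):
        # Retrieve the expectation, if any, for taking `action` in `state`.
        return self.expectations.get((state, action))

    def mismatch(self, expected, observed):
        # Degree of mismatch between expected and observed states; a
        # Jaccard-style distance over feature tuples stands in for a real
        # state-comparison metric.
        if expected is None:
            return 0.0
        e, o = set(expected), set(observed)
        return len(e ^ o) / max(len(e | o), 1)

    def recover(self, signature):
        # Reuse a recovery learned for this mismatch signature if one
        # exists; otherwise fall back on a heuristic option.
        if signature in self.learned_recoveries:
            return self.learned_recoveries[signature]
        return random.choice(self.heuristics)

    def record_success(self, signature, recovery):
        # Associate the recovery option with the mismatch it resolved so
        # the same problem can be handled directly next time.
        self.learned_recoveries[signature] = recovery


# Hypothetical usage: a door that fails to open despite the learned
# expectation triggers recovery, and the successful option is remembered.
monitor = ExpectationMonitor(0.5, ["replan_path", "back_up", "retry_action"])
monitor.observe(("at_door", "door_closed"), "open_door",
                ("at_door", "door_open"))
expected = monitor.expect(("at_door", "door_closed"), "open_door")
observed = ("at_door", "door_closed")  # the door did not open
if monitor.mismatch(expected, observed) > monitor.threshold:
    signature = (expected, observed)
    recovery = monitor.recover(signature)
    # ... execute the recovery; if it resolves the problem:
    monitor.record_success(signature, recovery)
```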
