Robot Task Interruption by Learning to Switch Among Multiple Models

While mobile robots reliably perform service tasks by accurately localizing and navigating safely while avoiding obstacles, they do not otherwise respond to their surroundings. We can make robots more responsive to their environment by equipping them with models of multiple tasks and a mechanism to interrupt a specific task and switch to another based on observations. However, a multiple-task-model approach raises two challenges: selecting which task model to execute based on observations, and coping with the potentially large set of observations associated with the full set of individual task models. We present a novel two-step solution. First, our approach leverages the tasks' policies and an abstract representation of their states to learn which task should be executed in each given world state. Second, the algorithm uses the learned tasks to identify the observation stimuli that trigger the interruption of one task and the switch to another. We show that our solution using the switching stimuli compares favorably to the naive approach of learning a combined model for all the tasks. Moreover, leveraging the stimuli significantly decreases the amount of sensory-input processing during task execution.
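To make the two-step idea concrete, below is a minimal Python sketch, assuming abstract world states are fixed-length feature vectors and each training sample is labeled with the index of the task whose policy should run there. The synthetic data, the ExtraTreesClassifier choice, the importance threshold, and the 0.6 hysteresis constant are all illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch of the two-step approach (all names and data here are
# illustrative assumptions, not the paper's actual implementation).
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

# --- Step 1: learn which task to execute in each abstract world state ---
# X_states: (n_samples, n_features) abstract state representations
# y_tasks:  (n_samples,) index of the task whose policy should run there
rng = np.random.default_rng(0)
X_states = rng.normal(size=(1000, 8))        # placeholder training data
y_tasks = (X_states[:, 0] > 0).astype(int)   # placeholder task labels

task_selector = ExtraTreesClassifier(n_estimators=100, random_state=0)
task_selector.fit(X_states, y_tasks)

# --- Step 2: identify the observation stimuli that trigger a switch ---
# Feature importances serve as a cheap proxy: dimensions the selector
# relies on become candidate "switching stimuli"; the rest can be
# ignored while a task runs, reducing sensory processing.
importances = task_selector.feature_importances_
stimuli = np.nonzero(importances > importances.mean())[0]

def maybe_switch(current_task: int, state: np.ndarray) -> int:
    """Interrupt the running task only when the stimuli favor another."""
    probe = np.zeros_like(state)
    probe[stimuli] = state[stimuli]          # only stimulus dims are processed
    proba = task_selector.predict_proba(probe.reshape(1, -1))[0]
    best = int(np.argmax(proba))
    # Hysteresis (0.6 is an illustrative constant): switch only on a
    # confident disagreement with the currently executing task.
    return best if best != current_task and proba[best] > 0.6 else current_task
```

Masking the non-stimulus dimensions stands in for the claimed reduction in sensory processing: once the stimuli are identified, the running task need only monitor those inputs to decide whether to interrupt itself.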
