Learning Behaviors Models for Robot Execution Control

Robust execution of robotic tasks is a difficult problem. In many situations, these tasks involve complex behaviors that combine different functionalities (e.g. perception, localization, motion planning and motion execution). These behaviors are often programmed with a strong focus on the robustness of the behavior itself, not on the definition of a "high level" model to be used by a task planner and an execution controller. We propose to learn behavior models as structured stochastic processes: Dynamic Bayesian Networks (DBNs). The DBN formalism allows us both to learn behaviors that have controllable parameters and to control them. We tested our approach on a real robot, where we learned the model of a complex navigation task over a large number of runs, using a modified version of Expectation Maximization for DBNs. The resulting DBN is then used to control the robot's navigation behavior, and we show that for some given objectives (e.g. avoid failure, optimize speed), the learned DBN-driven controller performs much better than the programmed controller. We also show a way to achieve efficient incremental learning of the DBN. We believe that the proposed approach is generic and can be used to learn complex behaviors other than navigation, and for other autonomous systems.
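To make the idea of a DBN-driven controller concrete, the following is a minimal sketch, not the model or algorithm from this work. It assumes a single discrete state variable, a single controllable parameter, and a hand-written transition table; in the approach described above the table would be learned from runs. All names (`STATES`, `CONTROLS`, `TRANSITION`, `SPEED`, `predict`, `choose_control`) and all probabilities are hypothetical. The controller propagates the current belief one time slice forward for each control value and picks the one with the best trade-off between speed and failure risk.

```python
# Toy one-step lookahead controller over a two-slice transition model.
# Every state, control, and probability here is an illustrative assumption,
# not a value taken from the paper.

STATES = ("nominal", "blocked", "failed")
CONTROLS = ("slow", "fast")

# TRANSITION[u][s][s2] = P(next state = s2 | current state = s, control = u)
TRANSITION = {
    "slow": {
        "nominal": {"nominal": 0.90, "blocked": 0.09, "failed": 0.01},
        "blocked": {"nominal": 0.50, "blocked": 0.45, "failed": 0.05},
        "failed": {"nominal": 0.00, "blocked": 0.00, "failed": 1.00},
    },
    "fast": {
        "nominal": {"nominal": 0.80, "blocked": 0.15, "failed": 0.05},
        "blocked": {"nominal": 0.30, "blocked": 0.40, "failed": 0.30},
        "failed": {"nominal": 0.00, "blocked": 0.00, "failed": 1.00},
    },
}

# Hypothetical progress made per step under each control value.
SPEED = {"slow": 0.3, "fast": 1.0}


def predict(belief, control):
    """Propagate a belief over STATES one time slice forward under `control`."""
    nxt = {s: 0.0 for s in STATES}
    for s, p in belief.items():
        for s2, q in TRANSITION[control][s].items():
            nxt[s2] += p * q
    return nxt


def choose_control(belief, failure_penalty):
    """Pick the control with the highest expected one-step utility.

    Utility = progress made if the robot has not failed,
              minus `failure_penalty` weighted by the failure probability.
    """
    def utility(control):
        p_fail = predict(belief, control)["failed"]
        return SPEED[control] * (1.0 - p_fail) - failure_penalty * p_fail

    return max(CONTROLS, key=utility)


if __name__ == "__main__":
    belief = {"nominal": 0.5, "blocked": 0.5, "failed": 0.0}
    print(choose_control(belief, failure_penalty=10.0))  # favors avoiding failure
    print(choose_control(belief, failure_penalty=0.0))   # favors speed
```

Changing `failure_penalty` shifts the controller between the two kinds of objective mentioned above (avoid failure versus optimize speed) without touching the model itself, which is the main appeal of separating the learned model from the decision rule.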
