Learning object, grasping and manipulation activities using hierarchical HMMs

This article presents a probabilistic algorithm for representing and learning complex manipulation activities performed by humans in everyday life. The work builds on the multi-level Hierarchical Hidden Markov Model (HHMM) framework, which decomposes longer-term manipulation activities into layers of abstraction whose building blocks are simpler action modules called action primitives. In this way, human task knowledge can be synthesised into a compact, effective representation suitable, for instance, for subsequent transfer to a robot for imitation. The main contribution is the use of a robust framework that can deal with the uncertainty and incomplete data inherent to these activities, and that can represent behaviours at multiple levels of abstraction for enhanced task generalisation. Activity data from 3D video sequences of humans manipulating different everyday objects is used for evaluation. A comparison with a hybrid generative-discriminative model, HHMM/SVM (support vector machine), is also presented to show the benefit of the proposed approach against comparable state-of-the-art techniques.
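The layered decomposition described above can be illustrated with a minimal sketch of a two-level hierarchy, in which a top-level chain over activities calls a sub-model over action primitives, and each primitive emits a discrete observation. This is not the article's model: the activity names, primitive names, observation symbols and all probabilities below are hypothetical, chosen only to show how control passes between levels.

```python
import random

# Hypothetical two-level hierarchy; every name and probability is illustrative.
# Each activity owns a small chain over action primitives with an explicit END.
ACTIVITIES = {
    "pour": {
        "start": "reach",
        "trans": {
            "reach": {"grasp": 1.0},
            "grasp": {"tilt": 0.8, "grasp": 0.2},
            "tilt": {"END": 0.5, "tilt": 0.5},
        },
    },
    "place": {
        "start": "grasp",
        "trans": {
            "grasp": {"move": 1.0},
            "move": {"release": 0.7, "move": 0.3},
            "release": {"END": 1.0},
        },
    },
}

# Top-level transitions between activities (assumed values).
TOP_TRANS = {"pour": {"place": 1.0}, "place": {"pour": 0.4, "place": 0.6}}

# Discrete emission distribution of each primitive (assumed values).
EMIT = {
    "reach": {"hand_moving": 0.9, "hand_still": 0.1},
    "grasp": {"hand_still": 0.8, "hand_moving": 0.2},
    "tilt": {"object_rotating": 0.9, "hand_still": 0.1},
    "move": {"hand_moving": 0.9, "hand_still": 0.1},
    "release": {"hand_still": 0.7, "hand_moving": 0.3},
}


def draw(dist, rng):
    """Sample a key from a {key: probability} dictionary."""
    r = rng.random()
    acc = 0.0
    for key, p in dist.items():
        acc += p
        if r < acc:
            return key
    return key  # guard against floating-point round-off


def sample(first_activity, n_activities, seed=0):
    """Generate (activity, primitive, observation) triples from the hierarchy."""
    rng = random.Random(seed)
    trace = []
    activity = first_activity
    for _ in range(n_activities):
        model = ACTIVITIES[activity]
        primitive = model["start"]
        # Run the lower-level chain until it reaches END, then return
        # control to the top level, which picks the next activity.
        while primitive != "END":
            observation = draw(EMIT[primitive], rng)
            trace.append((activity, primitive, observation))
            primitive = draw(model["trans"][primitive], rng)
        activity = draw(TOP_TRANS[activity], rng)
    return trace


if __name__ == "__main__":
    for step in sample("pour", 2, seed=1):
        print(step)
```

In a learning setting only the observations would be available, and the activity and primitive labels would have to be inferred. The sketch covers the generative direction only and omits parameter estimation.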
