Predicting human intention in visual observations of hand/object interactions

The main contribution of this paper is a probabilistic method for predicting human manipulation intention from image sequences of human-object interaction. Predicting intention amounts to inferring the imminent manipulation task once a human hand is observed to have stably grasped the object. Inference is performed by means of a probabilistic graphical model that encodes object-grasping tasks over the 3D state of the observed scene. The 3D state is extracted from RGB-D image sequences by a novel vision-based, markerless hand-object 3D tracking framework. To deal with the high-dimensional state space and the mixed data types (discrete and continuous) involved in grasping tasks, we introduce a generative vector quantization method using mixture models and self-organizing maps. This yields a compact model for encoding grasping actions that is capable of handling uncertain and partial sensory data. Experiments show that a model trained on simulated data provides a potent basis for accurate goal inference from partial and noisy observations of real-world demonstrations. We also present a grasp selection process, guided by the inferred human intention, to illustrate the use of the system for goal-directed grasp imitation.
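The mixture-model side of the quantization step can be illustrated with a minimal sketch: continuous grasp-state features are mapped to the components of a fitted generative mixture, whose indices serve as discrete symbols for the graphical model, while the component posteriors provide soft assignments for uncertain or partial observations. This is an illustrative approximation only, not the paper's implementation; the feature dimensionality, component count, and variable names below are assumptions, and the self-organizing-map stage is omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical continuous grasp-state features (e.g., hand and object
# pose parameters); the dimensions are chosen for illustration only.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 6))  # 500 observed states, 6-D each

# Fit a generative mixture model; each component acts as one discrete
# "codebook" symbol over which the graphical model is defined.
gmm = GaussianMixture(n_components=8, covariance_type="full", random_state=0)
gmm.fit(features)

# Hard quantization: map each continuous state to its most likely component.
symbols = gmm.predict(features)

# Soft quantization: posterior probabilities over components, usable as
# soft evidence when the sensory data are noisy or partial.
soft_evidence = gmm.predict_proba(features)
print(symbols[:5], soft_evidence[0].round(3))
```

Because the mixture is generative, unobserved feature dimensions can in principle be marginalized out rather than imputed, which is what makes this style of quantization suited to the partial observations discussed above.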
