Kinematic self retargeting: A framework for human pose estimation

This paper presents a model-based, Cartesian control-theoretic approach for estimating human pose from a set of key feature points (key-points) detected in depth images obtained from a time-of-flight imaging device. The key-points represent positions of anatomical landmarks, which are detected and tracked over time by a probabilistic inference algorithm that is robust to partial occlusions and capable of resolving ambiguities in detection. The detected key-points are subsequently kinematically self-retargeted, that is, mapped to the subject's own kinematic model, in order to predict the pose of an articulated human model at the current state, resolve ambiguities in key-point detection, and provide estimates of missing or intermittently occluded key-points. Based on a standard kinematic and mesh model of a human, constraints such as joint limit avoidance and self-penetration avoidance are enforced within the retargeting framework. The effectiveness of the algorithm is demonstrated experimentally for upper-body and full-body pose reconstruction from a small set of detected key-points. On average, the proposed algorithm runs at approximately 10 frames per second for upper-body reconstruction and 5 frames per second for whole-body reconstruction on a standard 2.13 GHz laptop PC.
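The retargeting step described above, which drives a kinematic model toward detected key-points under joint limits, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a planar two-link arm, a single wrist key-point, a damped least-squares closed-loop inverse kinematics update, and simple clamping as a stand-in for joint limit avoidance. The link lengths, joint ranges, gain, damping value, and all function names are hypothetical.

```python
import math

# Assumed values for illustration only (not taken from the paper).
LINK_LENGTHS = (0.30, 0.25)  # upper-arm and forearm lengths in metres
JOINT_LIMITS = ((-math.pi / 2, math.pi / 2), (0.0, 2.6))  # shoulder, elbow (rad)


def forward_kinematics(q):
    """Return the (x, y) wrist position of a planar two-link arm."""
    l1, l2 = LINK_LENGTHS
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return (x, y)


def jacobian(q):
    """Return the 2x2 Jacobian of the wrist position w.r.t. the joint angles."""
    l1, l2 = LINK_LENGTHS
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return ((-l1 * s1 - l2 * s12, -l2 * s12),
            (l1 * c1 + l2 * c12, l2 * c12))


def clik_step(q, keypoint, gain=0.5, damping=0.05):
    """One closed-loop update: dq = J^T (J J^T + damping^2 I)^-1 (gain * error)."""
    px, py = forward_kinematics(q)
    ex, ey = gain * (keypoint[0] - px), gain * (keypoint[1] - py)
    (a, b), (c, d) = jacobian(q)
    # A = J J^T + damping^2 I is symmetric positive definite, so det > 0.
    a11 = a * a + b * b + damping ** 2
    a12 = a * c + b * d
    a22 = c * c + d * d + damping ** 2
    det = a11 * a22 - a12 * a12
    # v = A^-1 e, written out for the 2x2 case.
    vx = (a22 * ex - a12 * ey) / det
    vy = (-a12 * ex + a11 * ey) / det
    dq = (a * vx + c * vy, b * vx + d * vy)  # J^T v
    # Clamping is a crude substitute for a proper joint limit avoidance scheme.
    return tuple(min(max(qi + dqi, lo), hi)
                 for qi, dqi, (lo, hi) in zip(q, dq, JOINT_LIMITS))


def retarget(keypoint, q0=(0.2, 0.5), iterations=200):
    """Iterate the closed-loop update so the model wrist tracks the key-point."""
    q = q0
    for _ in range(iterations):
        q = clik_step(q, keypoint)
    return q
```

Calling `retarget(forward_kinematics((0.4, 1.0)))` should return joint angles whose wrist position closely matches the target. For a key-point outside the arm's reach, the damping term keeps the update well defined near the stretched-out singular configuration, and the clamp keeps the angles inside the assumed ranges.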
