Recovering upper-body motion using a reinitialization particle filter

Abstract. Particle filtering suffers from two well-known problems: track drift, which is caused by inaccurate likelihood measurement functions and state models, and heavy computational cost in high-dimensional state spaces. The human body is an articulated object with many degrees of freedom, and people often perform varied, unconstrained motions, so both problems arise whenever human motion is reconstructed from monocular video sequences using particle filters. To overcome them, we present a novel approach to recovering three-dimensional human upper-body pose that combines a deterministic method with a stochastic one and exploits the strengths of both: highly accurate pose reconstruction and low computational cost. The reconstruction of the upper-body pose is divided into two parts: global pose estimation and pose estimation of the remaining joints. The global pose is determined by solving a system of six nonlinear equations established from three scale-invariant feature transform (SIFT) correspondences within the left and right shoulder segments. The pose of the remaining joints is estimated using two particle filters, one for the left arm and one for the right arm. The image projection of the segment model is obtained by forward kinematics under a perspective camera model, and likelihood measurement functions involving two features are presented to improve model fitting. To avoid track drift, each particle filter reinitializes its particle set when most of the particles deviate from the target object. The probability distribution of the new particle set is modeled by a Gaussian mixture model in which each Gaussian is determined by matched SIFT correspondences. Experimental results show that the proposed approach tracks the human upper body effectively and efficiently and is more accurate than the standard particle filter.
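The reinitialization idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random-walk state model, the equal-weight isotropic mixture, the multinomial resampling, and every name and threshold here (`pf_step`, `gmm_sample`, `anchors`, `good_threshold`, `min_good_fraction`) are assumptions made for illustration. In the paper the mixture components are determined by matched SIFT correspondences; here `anchors` simply stands in for the states those matches would suggest.

```python
import numpy as np

def gmm_sample(rng, means, sigma, n):
    """Draw n samples from an equal-weight, isotropic Gaussian mixture."""
    means = np.asarray(means, dtype=float)
    idx = rng.integers(0, len(means), size=n)  # choose one component per sample
    return means[idx] + sigma * rng.standard_normal((n, means.shape[1]))

def pf_step(rng, particles, likelihood, motion_sigma, anchors,
            good_threshold=0.1, min_good_fraction=0.5, reinit_sigma=0.05):
    """One predict/update/resample step with optional reinitialization.

    particles  : (N, D) array of state hypotheses (e.g. arm joint angles)
    likelihood : callable mapping an (N, D) array to (N,) non-negative scores
    anchors    : (K, D) states suggested by feature matches (hypothetical input)
    """
    n = len(particles)
    # Predict: random-walk state model (an assumption of this sketch).
    particles = particles + motion_sigma * rng.standard_normal(particles.shape)
    weights = likelihood(particles)

    # Reinitialize when most particles score poorly, i.e. the set has drifted.
    reinitialized = False
    if np.mean(weights >= good_threshold) < min_good_fraction:
        particles = gmm_sample(rng, anchors, reinit_sigma, n)
        weights = likelihood(particles)
        reinitialized = True

    # Normalize weights; the tiny constant guards against an all-zero vector.
    weights = weights + 1e-300
    weights = weights / weights.sum()

    # Weighted-mean state estimate, then multinomial resampling.
    estimate = np.average(particles, axis=0, weights=weights)
    resampled = particles[rng.choice(n, size=n, p=weights)]
    return resampled, estimate, reinitialized
```

In this sketch the drift test is a simple vote over per-particle likelihoods; other criteria, such as the effective sample size, would fit the same structure.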
