3D Shape-Encoded Particle Filter for Object Tracking and Its Application to Human Body Tracking

We present a nonlinear state estimation approach, based on particle filters, for tracking objects whose approximate 3D shapes are known. The unnormalized conditional density that solves the nonlinear filtering problem is governed by the Zakai equation and is realized by the weights of the particles. The weight of a particle represents its geometric and temporal fit, which is computed bottom-up from the raw image using a shape-encoded filter. The main contribution of the paper is the design of smoothing filters for feature extraction, combined with the adoption of unnormalized conditional density weights. The "shape filter" has the overall form of the predicted 2D projection of the 3D model, while its cross-section is designed to collect the gradient responses along the shape. The 3D-model-based representation is designed to emphasize changes in 2D object shape due to motion while de-emphasizing variations due to lighting and other imaging conditions. We have found that a sparse set of measurements, using a relatively small number of particles, approximates the high-dimensional state distribution very effectively. As a measure to stabilize the tracking, the amount of random diffusion is adjusted using a Kalman update of the covariance matrix. For the complex problem of human body tracking, we have successfully employed constraints derived from joint angles and walking motion.
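The idea of carrying unnormalized particle weights can be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the paper's implementation: the state is a single scalar, a Gaussian likelihood stands in for the shape-filter response, the diffusion is a fixed-variance random walk (no Kalman adjustment), and the resampling step is a generic addition. All function and parameter names (`step`, `estimate`, `track`, `diffusion_std`, `obs_std`) are hypothetical.

```python
import math
import random


def step(particles, weights, observation, diffusion_std, obs_std, rng):
    """One predict/update cycle. Weights are multiplied by the likelihood
    and deliberately left unnormalized."""
    # Predict: random-walk diffusion of each particle (assumed motion model).
    moved = [x + rng.gauss(0.0, diffusion_std) for x in particles]
    # Update: Gaussian likelihood as a stand-in for the shape-filter response.
    updated = [
        w * math.exp(-0.5 * ((observation - x) / obs_std) ** 2)
        for w, x in zip(weights, moved)
    ]
    return moved, updated


def estimate(particles, weights):
    """Weighted mean of the state; normalization happens only at read-out."""
    total = sum(weights)
    if total <= 0.0:
        # All weights underflowed: fall back to the plain particle mean.
        return sum(particles) / len(particles)
    return sum(w * x for w, x in zip(weights, particles)) / total


def track(observations, n_particles=200, diffusion_std=0.5, obs_std=1.0, seed=0):
    rng = random.Random(seed)
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    weights = [1.0] * n_particles
    estimates = []
    for z in observations:
        particles, weights = step(particles, weights, z, diffusion_std, obs_std, rng)
        estimates.append(estimate(particles, weights))
        sum_sq = sum(w * w for w in weights)
        if sum_sq == 0.0:
            # Degenerate case: reset to uniform weights instead of resampling.
            weights = [1.0] * n_particles
            continue
        # Resample when the effective sample size drops below half the particles.
        if sum(weights) ** 2 / sum_sq < n_particles / 2:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0] * n_particles
    return estimates


if __name__ == "__main__":
    print(track([0.0, 0.5, 1.0, 1.5, 2.0]))
```

Keeping the weights unnormalized between steps means the normalizing sum is computed only when an estimate is read out, which mirrors, in a very loose discrete form, the role of the unnormalized conditional density described above.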
