Monocular Tracking with a Mixture of View-Dependent Learned Models

This paper considers the problem of monocular human body tracking using learned models. We propose to learn the joint probability distribution of appearance and body pose using a mixture of view-dependent models. In such a way the multimodal and nonlinear relationships can be captured reliably. We formulate inference algorithms that are based on generative models while exploiting the advantages of a learned model when compared to the traditionally used geometric body models. Given static images or sequences, body poses and bounding box locations are inferred using silhouette based image descriptors. Prior information about likely body poses and a motion model are taken into account. We consider analytical computations and Monte-Carlo techniques, as well as a combination of both. In a Rao-Blackwellised particle filter, the tracking problem is partitioned into a part that is solved analytically, and a part that is solved with particle filtering. Tracking results are reported for human locomotion

[1]  Michael Isard,et al.  Tracking loose-limbed people , 2004, CVPR 2004.

[2]  Donald G. Bailey,et al.  An Efficient Euclidean Distance Transform , 2004, IWCIA.

[3]  David J. Fleet,et al.  Stochastic Tracking of 3 D Human Figures Using 2 D Image Motion , 2000 .

[4]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[5]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[6]  Pascal Fua,et al.  3D Human Body Tracking Using Deterministic Temporal Motion Models , 2004, ECCV.

[7]  Ankur Agarwal,et al.  3D human pose from silhouettes by relevance vector regression , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[8]  Ankur Agarwal,et al.  Monocular Human Motion Capture with a Mixture of Regressors , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[9]  A. Elgammal,et al.  Inferring 3D body pose from silhouettes using activity manifold learning , 2004, CVPR 2004.

[10]  Cristian Sminchisescu,et al.  Kinematic jump processes for monocular 3D human tracking , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[11]  Sidharth Bhatia,et al.  Tracking loose-limbed people , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[12]  Rómer Rosales,et al.  Learning Body Pose via Specialized Maps , 2001, NIPS.

[13]  Trevor Darrell,et al.  Inferring 3D structure with a statistical image-based shape model , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[14]  Cristian Sminchisescu,et al.  Discriminative density propagation for 3D human motion estimation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[15]  Nando de Freitas,et al.  Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks , 2000, UAI.