Tracking articulated bodies using Generalized Expectation Maximization

A generalized expectation maximization (GEM) algorithm is used to recover the pose of a person from a monocular video sequence shot with a moving camera. The set of possible poses is first embedded in a low-dimensional space using principal component analysis; the configuration that best matches the input image is then taken as the estimate for the current frame. The match is computed by iterating GEM, which alternately assigns edge pixels to body parts and finds the body pose that maximizes the likelihood of those assignments.
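The alternation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy model in which each body part is a single 2D point, the pose is one scalar coordinate `z` in a PCA-like linear embedding, and edge pixels are scattered around the part centers with isotropic Gaussian noise. All names (`MEAN`, `BASIS`, `SIGMA`, `fit_pose`, and so on) are hypothetical. The update is "generalized" in that it only moves partway toward the maximizer of the expected log-likelihood, which is enough to keep the likelihood from decreasing.

```python
import math

# Hypothetical toy model (an assumption, not taken from the paper): body part k
# is a 2D point whose position depends linearly on a scalar pose coordinate z,
# mimicking a one-component PCA embedding: center_k(z) = MEAN[k] + z * BASIS[k].
MEAN = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0)]
BASIS = [(0.0, 1.0), (0.0, -1.0), (0.0, 2.0)]
SIGMA = 1.0  # assumed isotropic noise on edge-pixel positions


def part_centers(z):
    """Positions of all body parts for pose coordinate z."""
    return [(m[0] + z * b[0], m[1] + z * b[1]) for m, b in zip(MEAN, BASIS)]


def e_step(pixels, z):
    """Soft-assign each edge pixel to a body part.

    Returns the responsibilities and the log-likelihood of the pixels
    (up to an additive constant that does not depend on z).
    """
    centers = part_centers(z)
    resp, loglik = [], 0.0
    for x, y in pixels:
        w = [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA ** 2))
             for cx, cy in centers]
        total = sum(w)
        resp.append([wk / total for wk in w])
        loglik += math.log(total / len(centers))
    return resp, loglik


def gem_step(pixels, resp, z, damping=0.5):
    """Generalized M-step: move partway toward the best pose.

    With the assignments fixed, the expected log-likelihood is a concave
    quadratic in z, so any step toward its maximizer z_star increases it.
    """
    num = den = 0.0
    for (x, y), r in zip(pixels, resp):
        for rk, m, b in zip(r, MEAN, BASIS):
            num += rk * (b[0] * (x - m[0]) + b[1] * (y - m[1]))
            den += rk * (b[0] ** 2 + b[1] ** 2)
    z_star = num / den
    return z + damping * (z_star - z)


def fit_pose(pixels, z0=0.0, iterations=30):
    """Alternate E-steps and generalized M-steps; return pose and likelihood trace."""
    z, history = z0, []
    for _ in range(iterations):
        resp, loglik = e_step(pixels, z)
        history.append(loglik)
        z = gem_step(pixels, resp, z)
    return z, history


def synthetic_pixels(z_true):
    """Noise-free toy 'edge pixels': three per part, spread along x."""
    offsets = (-0.1, 0.0, 0.1)
    return [(cx + dx, cy) for cx, cy in part_centers(z_true) for dx in offsets]


if __name__ == "__main__":
    z_hat, trace = fit_pose(synthetic_pixels(1.0))
    print("estimated pose coordinate:", z_hat)
```

In a tracking setting, the estimate from one frame would plausibly serve as the starting value `z0` for the next; that choice is an assumption of this sketch rather than something stated in the abstract.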
