A spatio-temporal 2D-models framework for human pose recovery in monocular sequences

This paper addresses the pose recovery problem for a particular articulated object: the human body. In this model-based approach, the 2D shape is associated with the corresponding stick figure, which allows segmentation and pose recovery of the observed subject to be performed jointly. The main disadvantage of 2D models is their dependence on the viewpoint. To cope with this limitation, local spatio-temporal 2D models corresponding to many views of the same sequences are trained, concatenated and sorted in a global framework. Temporal and spatial constraints are then used to build a probabilistic transition matrix (PTM) that gives a frame-to-frame estimate of the most probable local models to use during the fitting procedure, thus limiting the feature space. The approach takes advantage of 3D information while avoiding the use of a complex 3D human model. Experiments on both indoor and outdoor sequences demonstrate that the approach adequately segments pedestrians and estimates their poses independently of the direction of motion during the sequence.
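
The abstract does not say how the PTM is constructed, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each local model is indexed by a (view, temporal phase) pair, that views and phases are both cyclic, and that transition weights are Gaussian in the view distance and in the deviation from a one-step phase advance. The function names `build_ptm` and `candidate_models` and the parameters `sigma_view` and `sigma_phase` are hypothetical.

```python
import numpy as np

def circular_distance(a, b, period):
    """Shortest distance between two indices on a ring of length `period`."""
    d = np.abs(a - b) % period
    return np.minimum(d, period - d)

def build_ptm(n_views, n_phases, sigma_view=1.0, sigma_phase=1.0):
    """Row-stochastic transition matrix over n_views * n_phases local models.

    Assumption: model i corresponds to view i // n_phases and
    temporal phase i % n_phases.
    """
    views, phases = np.divmod(np.arange(n_views * n_phases), n_phases)
    # Spatial constraint: the viewpoint changes slowly between frames.
    dv = circular_distance(views[:, None], views[None, :], n_views)
    # Temporal constraint: the phase is expected to advance by one step.
    expected = (phases[:, None] + 1) % n_phases
    dp = circular_distance(expected, phases[None, :], n_phases)
    weights = np.exp(-0.5 * (dv / sigma_view) ** 2) \
            * np.exp(-0.5 * (dp / sigma_phase) ** 2)
    # Normalise each row so that it is a probability distribution.
    return weights / weights.sum(axis=1, keepdims=True)

def candidate_models(ptm, current, k=3):
    """Indices of the k most probable local models for the next frame."""
    return np.argsort(ptm[current])[::-1][:k]

ptm = build_ptm(n_views=8, n_phases=4)
best = candidate_models(ptm, current=0, k=1)
```

Restricting the fitting step to the few models returned by `candidate_models` is one way to read "limiting the feature space": only a handful of local models, rather than all of them, are tried at each frame.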
