Boosted deformable model for human body alignment

This paper studies image alignment, the problem of learning a shape and appearance model from labeled data and efficiently fitting the model to a non-rigid object with large variations. Given a set of images with manually labeled landmarks, our model representation consists of a shape component represented by a point distribution model and an appearance component represented by a collection of local features, trained discriminatively as a two-class classifier using boosting. Images with ground truth landmarks are the positive training samples while those with perturbed landmarks are considered as negatives. Enabled by piece-wise affine warping, corresponding local feature positions across all training samples form a hypothesis space for boosting. Image alignment is performed by maximizing the boosted classifier score, which is our distance measure, through iteratively mapping the feature positions to the image, and computing the gradient direction of the score with respect to the shape parameter. We apply this approach to human body alignment from surveillance-type images. We conduct experiments on the MIT pedestrian database where the body size is approximately 110 times 46 pixels, and demonstrate our real-time alignment capability.

[1]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[2]  Shai Avidan,et al.  Support vector tracking , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Gang Hua,et al.  Learning to estimate human pose with data driven belief propagation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[4]  Jiebo Luo,et al.  Body Localization in Still Images Using Hierarchical Models and Hybrid Search , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[5]  Jitendra Malik,et al.  Recovering human body configurations: combining segmentation and recognition , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[6]  Timothy F. Cootes,et al.  Active Appearance Models , 1998, ECCV.

[7]  Mun Wai Lee,et al.  A model-based approach for estimating human 3D poses in static images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Simon Baker,et al.  Active Appearance Models Revisited , 2004, International Journal of Computer Vision.

[9]  Takeo Kanade,et al.  Learning GMRF Structures for Spatial Priors , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Ting Yu,et al.  Gradient Feature Selection for Online Boosting , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[11]  Michael J. Black,et al.  Measure Locally, Reason Globally: Occlusion-sensitive Articulated Pose Estimation , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[12]  Rainer Lienhart,et al.  Empirical Analysis of Detection Cascades of Boosted Classifiers for Rapid Object Detection , 2003, DAGM-Symposium.

[13]  Xiaoming Liu,et al.  Generic Face Alignment using Boosted Appearance Model , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  J. Friedman Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .

[15]  Tomaso A. Poggio,et al.  Trainable pedestrian detection , 1999, Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348).

[16]  Timothy F. Cootes,et al.  A Trainable Method of Parametric Shape Description , 1991, BMVC.

[17]  Ivan Laptev,et al.  Improvements of Object Detection Using Boosted Histograms , 2006, BMVC.

[18]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[19]  Fatih Murat Porikli,et al.  Integral histogram: a fast way to extract histograms in Cartesian spaces , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[20]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[21]  Tomaso A. Poggio,et al.  A general framework for object detection , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[22]  M. Turk,et al.  Eigenfaces for Recognition , 1991, Journal of Cognitive Neuroscience.

[23]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, International Journal of Computer Vision.

[24]  Pedro F. Felzenszwalb Representation and detection of deformable shapes , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[25]  Timothy F. Cootes,et al.  Trainable method of parametric shape description , 1992, Image Vis. Comput..

[26]  Cristian Sminchisescu,et al.  Training Deformable Models for Localization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[27]  Jitendra Malik,et al.  Recovering human body configurations using pairwise constraints between parts , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[28]  Yair Weiss,et al.  Learning object detection from a small number of examples: the importance of good features , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..