Estimating Correspondences of Deformable Objects “In-the-Wild”

During the past few years we have witnessed the development of many methodologies for building and fitting Statistical Deformable Models (SDMs). The construction of accurate SDMs requires careful annotation of images with regards to a consistent set of landmarks. However, the manual annotation of a large amount of images is a tedious, laborious and expensive procedure. Furthermore, for several deformable objects, e.g. human body, it is difficult to define a consistent set of landmarks, and, thus, it becomes impossible to train humans in order to accurately annotate a collection of images. Nevertheless, for the majority of objects, it is possible to extract the shape by object segmentation or even by shape drawing. In this paper, we show for the first time, to the best of our knowledge, that it is possible to construct SDMs by putting object shapes in dense correspondence. Such SDMs can be built with much less effort for a large battery of objects. Additionally, we show that, by sampling the dense model, a part-based SDM can be learned with its parts being in correspondence. We employ our framework to develop SDMs of human arms and legs, which can be used for the segmentation of the outline of the human body, as well as to provide better and more consistent annotations for body joints.

[1]  Timothy F. Cootes,et al.  Groupwise Diffeomorphic Non-rigid Registration for Automatic Model Building , 2004, ECCV.

[2]  Ben Taskar,et al.  MODEC: Multimodal Decomposable Models for Human Pose Estimation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Simon Baker,et al.  Active Appearance Models Revisited , 2004, International Journal of Computer Vision.

[4]  Stefanos Zafeiriou,et al.  300 Faces in-the-Wild Challenge: The First Facial Landmark Localization Challenge , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[5]  Nikos Paragios,et al.  Shape registration in implicit spaces using information theory and free form deformations , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Fatih Murat Porikli,et al.  Support Vector Shape: A Classifier-Based Shape Representation , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Brendan J. Frey,et al.  Learning appearance and transparency manifolds of occluded objects in layers , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[8]  Andrew Zisserman,et al.  Deep Convolutional Neural Networks for Efficient Pose Estimation in Gesture Videos , 2014, ACCV.

[9]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[10]  Timothy F. Cootes,et al.  Automatic Construction of Parts+Geometry Models for Initializing Groupwise Registration , 2012, IEEE Transactions on Medical Imaging.

[11]  Nebojsa Jojic,et al.  Escaping local minima through hierarchical model selection: Automatic object discovery, segmentation, and tracking in video , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[12]  Xiaogang Wang,et al.  Pedestrian Parsing via Deep Decompositional Network , 2013, 2013 IEEE International Conference on Computer Vision.

[13]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Stefanos Zafeiriou,et al.  Feature-Based Lucas–Kanade and Active Appearance Models , 2015, IEEE Transactions on Image Processing.

[15]  Georgios Tzimiropoulos,et al.  Project-Out Cascaded Regression with an application to face alignment , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Bernt Schiele,et al.  2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Antonin Chambolle,et al.  A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging , 2011, Journal of Mathematical Imaging and Vision.

[18]  Maja Pantic,et al.  Optimization Problems for Fast AAM Fitting in-the-Wild , 2013, 2013 IEEE International Conference on Computer Vision.

[19]  David J. Kriegman,et al.  Localizing parts of faces using a consensus of exemplars , 2011, CVPR.

[20]  Fred L. Bookstein,et al.  Principal Warps: Thin-Plate Splines and the Decomposition of Deformations , 1989, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  L. Rudin,et al.  Nonlinear total variation based noise removal algorithms , 1992 .

[22]  Andrew Zisserman,et al.  Flowing ConvNets for Human Pose Estimation in Videos , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[23]  Iasonas Kokkinos,et al.  Unsupervised Learning of Object Deformation Models , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[24]  Simon Lucey,et al.  Deformable Model Fitting by Regularized Landmark Mean-Shift , 2010, International Journal of Computer Vision.

[25]  Fernando De la Torre,et al.  Supervised Descent Method and Its Applications to Face Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Josephine Sullivan,et al.  One millisecond face alignment with an ensemble of regression trees , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Jeff G. Schneider,et al.  Automatic construction of active appearance models as an image coding problem , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Andrew Zisserman,et al.  Upper Body Detection and Tracking in Extended Signing Sequences , 2011, International Journal of Computer Vision.

[29]  Changsheng Xu,et al.  Matching-CNN meets KNN: Quasi-parametric human parsing , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  C. Schmid,et al.  Learning shape prior models for object matching , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Jian Sun,et al.  Face Alignment by Explicit Shape Regression , 2012, International Journal of Computer Vision.

[32]  Stefanos Zafeiriou,et al.  Robust Discriminative Response Map Fitting with Constrained Local Models , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Jitendra Malik,et al.  Poselets: Body part detectors trained using 3D human pose annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[34]  Stefanos Zafeiriou,et al.  Bayesian Active Appearance Models , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Yi Yang,et al.  Articulated Human Detection with Flexible Mixtures of Parts , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[36]  Stefanos Zafeiriou,et al.  Menpo: A Comprehensive Platform for Parametric Image Alignment and Visual Deformable Models , 2014, ACM Multimedia.

[37]  Luc Van Gool,et al.  Human Pose Estimation Using Body Parts Dependent Joint Regressors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Andrew Zisserman,et al.  Domain Adaptation for Upper Body Pose Tracking in Signed TV Broadcasts , 2013, BMVC.

[39]  Thomas S. Huang,et al.  Interactive Facial Feature Localization , 2012, ECCV.

[40]  Pietro Perona,et al.  Robust Face Landmark Estimation under Occlusion , 2013, 2013 IEEE International Conference on Computer Vision.

[41]  Deva Ramanan,et al.  Increasing the density of Active Appearance Models , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[42]  Stefanos Zafeiriou,et al.  Unifying holistic and Parts-Based Deformable Model fitting , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[43]  Timothy F. Cootes,et al.  Active Shape Models-Their Training and Application , 1995, Comput. Vis. Image Underst..

[44]  Stefanos Zafeiriou,et al.  300 Faces In-The-Wild Challenge: database and results , 2016, Image Vis. Comput..

[45]  Thomas Vetter,et al.  On compositional Image Alignment, with an application to Active Appearance Models , 2009, CVPR.

[46]  Maja Pantic,et al.  Gauss-Newton Deformable Part Models for Face Alignment In-the-Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[47]  Petros Maragos,et al.  Adaptive and constrained algorithms for inverse compositional Active Appearance Model fitting , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Stefanos Zafeiriou,et al.  Incremental Face Alignment in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  Sami Romdhani,et al.  Optimal Step Nonrigid ICP Algorithms for Surface Registration , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[50]  Stefanos Zafeiriou,et al.  Face Flow , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[51]  Stefanos Zafeiriou,et al.  Active Pictorial Structures , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[52]  Carlo Tomasi,et al.  Dense Lagrangian motion estimation with occlusions , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[53]  Permalink Unsupervised Learning of Object Deformation Models , 2007 .

[54]  Andrew Zisserman,et al.  Upper Body Pose Estimation with Temporal Sequential Forests , 2014, BMVC.

[55]  Michael J. Black,et al.  From Pictorial Structures to deformable structures , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[56]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[57]  Jonathan Tompson,et al.  Efficient object localization using Convolutional Networks , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[58]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[59]  Lourdes Agapito,et al.  A Variational Approach to Video Registration with Subspace Constraints , 2013, International Journal of Computer Vision.

[60]  David J. Kriegman,et al.  Localizing Parts of Faces Using a Consensus of Exemplars , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[61]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[62]  Timothy F. Cootes,et al.  Active Appearance Models , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[63]  Cheng Li,et al.  Face alignment by coarse-to-fine shape searching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[64]  Xiaoming Liu,et al.  Simultaneous alignment and clustering for an image ensemble , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[65]  Luc Van Gool,et al.  Body Parts Dependent Joint Regressors for Human Pose Estimation in Still Images , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[66]  Björn Stenger,et al.  Using Bounded Diameter Minimum Spanning Trees to Build Dense Active Appearance Models , 2014, International Journal of Computer Vision.

[67]  Stefanos Zafeiriou,et al.  Automatic Construction of Deformable Models In-the-Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.