Real-time multi-view facial landmark detector learned by the structured output SVM

We propose a real-time multi-view landmark detector based on Deformable Part Models (DPM). The detector is composed of a mixture of tree based DPMs, each component describing landmark configurations in a specific range of viewing angles. The usage of view specific DPMs allows to capture a large range of poses and to deal with the problem of self-occlusions. Parameters of the detector are learned from annotated examples by the Structured Output Support Vector Machines algorithm. The learning objective is directly related to the performance measure used for detector evaluation. The tree based DPM allows to find a globally optimal landmark configuration by the dynamic programming. We propose a coarse-to-fine search strategy which allows real-time processing by the dynamic programming also on high resolution images. Empirical evaluation on "in the wild" images shows that the proposed detector is competitive with the state-of-the-art methods in terms of speed and accuracy yet it keeps the guarantee of finding a globally optimal estimate in contrast to other methods. Real-time multi-view landmark detector independent on initializationDetector composed of a mixture of tree based Deformable Part ModelsLandmark detector learned by the Structured Output Support Vector MachinesCoarse-to-fine strategy to speed up inference based on dynamic programming

[1]  Dima Damen,et al.  Proceedings of the British Machine Vision Conference , 2014, BMVC 2014.

[2]  Maja Pantic,et al.  Facial point detection using boosted regression and graph models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Stefanos Zafeiriou,et al.  A Semi-automatic Methodology for Facial Landmark Annotation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[4]  Donghoon Lee,et al.  Joint Estimation of Pose and Face Landmark , 2014, ACCV.

[5]  Enver Sangineto,et al.  Pose and Expression Independent Facial Landmark Localization Using Dense-SURF and the Hausdorff Distance , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Fernando De la Torre,et al.  Supervised Descent Method and Its Applications to Face Alignment , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Václav Hlavác,et al.  Detector of Facial Landmarks Learned by the Structured Output SVM , 2012, VISAPP.

[8]  Andrew Zisserman,et al.  Hello! My name is... Buffy'' -- Automatic Naming of Characters in TV Video , 2006, BMVC.

[9]  Josephine Sullivan,et al.  One millisecond face alignment with an ensemble of regression trees , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[11]  Maja Pantic,et al.  Fully automatic facial feature point detection using Gabor feature based boosted classifiers , 2005, 2005 IEEE International Conference on Systems, Man and Cybernetics.

[12]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Thomas Vetter,et al.  Face Recognition Based on Fitting a 3D Morphable Model , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[14]  Junzhou Huang,et al.  Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model , 2013, 2013 IEEE International Conference on Computer Vision.

[15]  David J. Kriegman,et al.  Localizing Parts of Faces Using a Consensus of Exemplars , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  M MartínezAleix Recognizing Imprecisely Localized, Partially Occluded, and Expression Variant Faces from a Single Sample per Class , 2002 .

[18]  Roland Göcke,et al.  A Nonlinear Discriminative Approach to AAM Fitting , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[19]  Bülent Sankur,et al.  Robust facial landmarking for registration , 2007, Ann. des Télécommunications.

[20]  Martin A. Fischler,et al.  The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[21]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[22]  Václav Hlavác,et al.  Real-time multi-view facial landmark detector learned by the structured output SVM , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[23]  Jiri Matas,et al.  Feature-based affine-invariant localization of faces , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Yang Wang,et al.  Enforcing convexity for improved alignment with constrained local models , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Roland Göcke,et al.  Learning AAM fitting through simulation , 2009, Pattern Recognition.

[26]  Horst Bischof,et al.  Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[27]  David Cristinacce,et al.  Automatic feature localisation with constrained local models , 2008, Pattern Recognit..

[28]  Timothy F. Cootes,et al.  Active Appearance Models , 1998, ECCV.

[29]  Simon Baker,et al.  Active Appearance Models Revisited , 2004, International Journal of Computer Vision.

[30]  Daniel P. Huttenlocher,et al.  Distance Transforms of Sampled Functions , 2012, Theory Comput..

[31]  Nicu Sebe,et al.  Best of Automatic Face and Gesture Recognition 2008 , 2010, Image Vis. Comput..

[32]  Timothy F. Cootes,et al.  Active Shape Models-Their Training and Application , 1995, Comput. Vis. Image Underst..

[33]  Fred Nicolls,et al.  Active shape models with SIFT descriptors and MARS , 2015, 2014 International Conference on Computer Vision Theory and Applications (VISAPP).

[34]  Maja Pantic,et al.  Gauss-Newton Deformable Part Models for Face Alignment In-the-Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Timothy F. Cootes,et al.  Feature Detection and Tracking with Constrained Local Models , 2006, BMVC.

[36]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[37]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[38]  Stefanos Zafeiriou,et al.  Incremental Face Alignment in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Maja Pantic,et al.  Local Evidence Aggregation for Regression-Based Facial Point Detection , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[40]  Ohad Shamir,et al.  Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes , 2012, ICML.

[41]  K. Schittkowski,et al.  NONLINEAR PROGRAMMING , 2022 .

[43]  Andrew Zisserman,et al.  “Who are you?” - Learning person specific classifiers from video , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Stefanos Zafeiriou,et al.  300 Faces in-the-Wild Challenge: The First Facial Landmark Localization Challenge , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[45]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[46]  Simon Lucey,et al.  Deformable Model Fitting by Regularized Landmark Mean-Shift , 2010, International Journal of Computer Vision.

[47]  Jian Sun,et al.  Face Alignment at 3000 FPS via Regressing Local Binary Features , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Pietro Perona,et al.  Cascaded pose regression , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[49]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[50]  Takeo Kanade,et al.  Multi-PIE , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[51]  Jiri Matas,et al.  A 3D Approach to Facial Landmarks: Detection, Refinement, and Tracking , 2014, 2014 22nd International Conference on Pattern Recognition.

[52]  Alexander J. Smola,et al.  Bundle Methods for Regularized Risk Minimization , 2010, J. Mach. Learn. Res..

[53]  Stefanos Zafeiriou,et al.  Robust Discriminative Response Map Fitting with Constrained Local Models , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[54]  Václav Hlavác,et al.  Ten Lectures on Statistical and Structural Pattern Recognition , 2002, Computational Imaging and Vision.

[55]  Jiri Matas,et al.  WaldBoost - learning for time constrained sequential detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[56]  Lance Williams,et al.  Pyramidal parametrics , 1983, SIGGRAPH.

[57]  Aleix M. Martínez,et al.  Recognizing Imprecisely Localized, Partially Occluded, and Expression Variant Faces from a Single Sample per Class , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[58]  Sören Sonnenburg,et al.  COFFIN: A Computational Framework for Linear SVMs , 2010, ICML.

[59]  Vajirasak Vanijja,et al.  Modified Deformable Parts Model for Side Profile Facial Feature Detection , 2013, IAIT.