Face alignment in-the-wild: A Survey

Over the last two decades, face alignment, i.e., localizing fiducial facial points, has received increasing attention owing to its wide range of applications in automatic face analysis. However, the task has proven extremely challenging in unconstrained environments because of many confounding factors, such as pose, occlusion, expression, and illumination. While numerous techniques have been developed to address these challenges, the problem is still far from solved. In this survey, we present an up-to-date critical review of the existing literature on face alignment, focusing on methods that address the overall difficulties and challenges of this topic under uncontrolled conditions. Specifically, we categorize existing face alignment techniques, present detailed descriptions of the prominent algorithms within each category, and discuss their advantages and disadvantages. Furthermore, we devote separate discussions to the practical aspects of face alignment in-the-wild, with a view to developing a robust face alignment system. In addition, we report performance statistics for the state of the art, and conclude the paper with several promising directions for future research.
