Pose-Invariant Face Alignment via CNN-Based Dense 3D Model Fitting

Pose-invariant face alignment is a very challenging problem in computer vision, which is used as a prerequisite for many facial analysis tasks, e.g., face recognition, expression recognition, and 3D face reconstruction. Recently, there have been a few attempts to tackle this problem, but still more research is needed to achieve higher accuracy. In this paper, we propose a face alignment method that aligns an image with arbitrary poses, by combining the powerful cascaded CNN regressors, 3D Morphable Model (3DMM), and mirrorability constraint. The core of our proposed method is a novel 3DMM fitting algorithm, where the camera projection matrix parameters and 3D shape parameters are estimated by a cascade of CNN-based regressors. Furthermore, we impose the mirrorability constraint during the CNN learning by employing a novel loss function inside the siamese network. The dense 3D shape enables us to design pose-invariant appearance features for effective CNN learning. Extensive experiments are conducted on the challenging large-pose face databases (AFLW and AFW), with comparison to the state of the art.

[1]  Xiaoou Tang,et al.  Facial Landmark Detection by Deep Multi-task Learning , 2014, ECCV.

[2]  Gang Hua,et al.  A convolutional neural network cascade for face detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Xiaoming Liu,et al.  Attribute preserved face de-identification , 2015, 2015 International Conference on Biometrics (ICB).

[4]  Timothy F. Cootes,et al.  Active Shape Models: Evaluation of a Multi-Resolution Method for Improving Image Search , 1994, BMVC.

[5]  David J. Kriegman,et al.  Localizing parts of faces using a consensus of exemplars , 2011, CVPR.

[6]  Junzhou Huang,et al.  Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model , 2013, 2013 IEEE International Conference on Computer Vision.

[7]  Maja Pantic,et al.  Facial point detection using boosted regression and graph models , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[8]  Kun Zhou,et al.  Displaced dynamic expression regression for real-time facial tracking and animation , 2014, ACM Trans. Graph..

[9]  Andrew Zisserman,et al.  Deep Convolutional Neural Networks for Efficient Pose Estimation in Gesture Videos , 2014, ACCV.

[10]  Ioannis Patras,et al.  Mirror, mirror on the wall, tell me, is the error small? , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Takeo Kanade,et al.  Dense 3D face alignment from 2D videos in real-time , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[12]  Xiaoming Liu,et al.  Discriminative Face Alignment , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Dimitris N. Metaxas,et al.  Consensus of Regression for Occlusion-Robust Facial Feature Localization , 2014, ECCV.

[14]  Xiaoming Liu,et al.  Video-based face model fitting using Adaptive Active Appearance Model , 2010, Image Vis. Comput..

[15]  Nicu Sebe,et al.  The First 3D Face Alignment in the Wild (3DFAW) Challenge , 2016, ECCV Workshops.

[16]  Kavita Bala,et al.  Learning visual similarity for product design with convolutional neural networks , 2015, ACM Trans. Graph..

[17]  Oliver Chiu-sing Choy,et al.  Deep sparse rectifier neural networks for speech denoising , 2016, 2016 IEEE International Workshop on Acoustic Signal Enhancement (IWAENC).

[18]  Andrea Vedaldi,et al.  MatConvNet: Convolutional Neural Networks for MATLAB , 2014, ACM Multimedia.

[19]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[20]  Jian Sun,et al.  Face Alignment by Explicit Shape Regression , 2012, International Journal of Computer Vision.

[21]  Dorin Comaniciu,et al.  Conditional density learning via regression with application to deformable shape segmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Jürgen Beyerer,et al.  Adaptive Contour Fitting for Pose-Invariant 3D Face Shape Reconstruction , 2015, BMVC.

[23]  Heng Yang,et al.  Facial feature point detection: A comprehensive survey , 2014, Neurocomputing.

[24]  Yiying Tong,et al.  FaceWarehouse: A 3D Facial Expression Database for Visual Computing , 2014, IEEE Transactions on Visualization and Computer Graphics.

[25]  Takeo Kanade,et al.  Real-time combined 2D+3D active appearance models , 2004, CVPR 2004.

[26]  Shiguang Shan,et al.  Coarse-to-Fine Auto-Encoder Networks (CFAN) for Real-Time Face Alignment , 2014, ECCV.

[27]  Qiang Ji,et al.  Robust Facial Landmark Detection Under Significant Head Poses and Occlusion , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[28]  Hyeonjoon Moon,et al.  The FERET Evaluation Methodology for Face-Recognition Algorithms , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[29]  Georgios Tzimiropoulos,et al.  Project-Out Cascaded Regression with an application to face alignment , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Nikos Komodakis,et al.  Learning to compare image patches via convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Yuning Jiang,et al.  Extensive Facial Landmark Localization with Coarse-to-Fine Convolutional Network Cascade , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[32]  Cheng Li,et al.  Face alignment by coarse-to-fine shape searching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Yiying Tong,et al.  Adaptive 3D Face Reconstruction from Unconstrained Photo Collections , 2016, CVPR.

[34]  Xiaoming Liu,et al.  Pose-Invariant 3D Face Alignment , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[35]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[36]  Xiaoming Liu,et al.  Large-Pose Face Alignment via CNN-Based Dense 3D Model Fitting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Xiaogang Wang,et al.  Deep Convolutional Network Cascade for Facial Point Detection , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Geoffrey E. Hinton,et al.  Visualizing Data using t-SNE , 2008 .

[39]  Simon Lucey,et al.  Face alignment through subspace constrained mean-shifts , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[40]  Yiying Tong,et al.  Unconstrained 3D face reconstruction , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[41]  Hossein Mobahi,et al.  Toward a Practical Face Recognition System: Robust Alignment and Illumination by Sparse Representation , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[42]  Deva Ramanan,et al.  Face detection, pose estimation, and landmark localization in the wild , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[43]  Simon Baker,et al.  Active Appearance Models Revisited , 2004, International Journal of Computer Vision.

[44]  Wen Gao,et al.  Curse of mis-alignment in face recognition: problem and a novel mis-alignment learning solution , 2004, Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004. Proceedings..

[45]  Pietro Perona,et al.  Robust Face Landmark Estimation under Occlusion , 2013, 2013 IEEE International Conference on Computer Vision.

[46]  Sami Romdhani,et al.  A 3D Face Model for Pose and Illumination Invariant Face Recognition , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[47]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[48]  Horst Bischof,et al.  Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[49]  Nicu Sebe,et al.  Regressing a 3D Face Shape from a Single Image , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[50]  Xiangyu Zhu,et al.  High-fidelity Pose and Expression Normalization for face recognition in the wild , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[51]  Shih-Chieh Huang,et al.  Regressive Tree Structured Model for Facial Landmark Localization , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[52]  Bin Yang,et al.  Convolutional Channel Features , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[53]  Yoshua Bengio,et al.  Deep Sparse Rectifier Neural Networks , 2011, AISTATS.

[54]  Xiangyu Zhu,et al.  Discriminative 3D morphable model fitting , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).

[55]  Thomas Vetter,et al.  Expression invariant 3D face recognition with a Morphable Model , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[56]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.