Deep Context-Sensitive Facial Landmark Detection With Tree-Structured Modeling

Facial landmark detection is typically cast as a point-wise regression problem, with the focus on building an effective image-to-point mapping function. In this paper, we propose an end-to-end deep learning approach that combines contextually discriminative feature construction with effective facial structure modeling. By capturing the contextual information associated with each landmark, the approach makes its landmark predictions more contextually discriminative. Moreover, we present a tree model that characterizes human face structure and a structural loss function that measures the deformation cost between the ground-truth and predicted tree models; both are incorporated into the proposed learning approach and jointly optimized within a unified framework. The tree model characterizes the spatial layout patterns of facial landmarks and thereby captures facial structure information. Experimental results on the MTFL and AFLW-full data sets demonstrate the effectiveness of the proposed approach compared with state-of-the-art methods.
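
To make the idea of a structural loss over a tree model concrete, the following is a minimal sketch of one possible reading, not the formulation used in the paper. It assumes a hypothetical five-point layout (left eye, right eye, nose, two mouth corners), a tree rooted at the nose, a deformation cost defined as the mean squared difference between corresponding edge vectors, and an arbitrary weight for combining it with a point-wise term. The names `TREE_EDGES`, `pointwise_loss`, `deformation_cost`, and `joint_loss` are all illustrative.

```python
# Hypothetical 5-point layout: 0 left eye, 1 right eye, 2 nose,
# 3 left mouth corner, 4 right mouth corner. The tree rooted at the
# nose is an assumption for illustration, not the paper's tree.
TREE_EDGES = [(2, 0), (2, 1), (2, 3), (2, 4)]  # (parent, child) index pairs


def pointwise_loss(pred, truth):
    """Mean squared distance between predicted and ground-truth landmarks."""
    return sum((px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, truth)) / len(truth)


def deformation_cost(pred, truth, edges=TREE_EDGES):
    """Mean squared difference between corresponding edge vectors of two trees."""
    total = 0.0
    for parent, child in edges:
        # Edge vector = child position relative to its parent.
        pdx = pred[child][0] - pred[parent][0]
        pdy = pred[child][1] - pred[parent][1]
        tdx = truth[child][0] - truth[parent][0]
        tdy = truth[child][1] - truth[parent][1]
        total += (pdx - tdx) ** 2 + (pdy - tdy) ** 2
    return total / len(edges)


def joint_loss(pred, truth, weight=0.5):
    """Point-wise term plus a weighted structural term (the weight is assumed)."""
    return pointwise_loss(pred, truth) + weight * deformation_cost(pred, truth)


truth = [(30, 40), (70, 40), (50, 60), (35, 80), (65, 80)]
shifted = [(x + 5, y) for x, y in truth]  # rigid shift of the whole face
print(pointwise_loss(shifted, truth), deformation_cost(shifted, truth))
```

In this sketch a rigid shift of all landmarks changes the point-wise term but leaves the deformation cost at zero, which illustrates why a structural term carries information that point-wise regression alone does not.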
