KEPLER: Keypoint and Pose Estimation of Unconstrained Faces by Learning Efficient H-CNN Regressors

Keypoint detection is one of the most important pre-processing steps in tasks such as face modeling, recognition and verification. In this paper, we present an iterative method for Keypoint Estimation and Pose prediction of unconstrained faces by Learning Efficient H-CNN Regressors (KEPLER) for addressing the face alignment problem. Recent state-of-the-art methods have shown improvements in face keypoint detection by employing Convolutional Neural Networks (CNNs). Although a simple feed-forward neural network can learn the mapping between input and output spaces, it cannot learn the inherent structural dependencies. We present a novel architecture called H-CNN (Heatmap-CNN) which captures structured global and local features and thus favors accurate keypoint detection. H-CNN is jointly trained on the visibility, fiducials and 3D pose of the face. As the iterations proceed, the error decreases, making the gradients small; mitigating this requires efficient training of DCNNs. KEPLER performs global corrections in pose and fiducials for the first four iterations, followed by local corrections in a subsequent stage. As a by-product, KEPLER also accurately provides the 3D pose (pitch, yaw and roll) of the face. In this paper, we show that, without using any 3D information, KEPLER outperforms state-of-the-art methods for alignment on challenging datasets such as AFW [38] and AFLW [17].
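The iteration schedule described above (four global corrections, then a local stage) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the current keypoint estimate is encoded as one Gaussian heatmap per keypoint, which the abstract does not spell out, and it replaces the trained H-CNN with a hypothetical stand-in `toy_step`. The names `render_heatmaps`, `refine`, `global_step` and `local_step` are illustrative only, and visibility and pose prediction are omitted.

```python
import numpy as np


def render_heatmaps(keypoints, size, sigma=2.0):
    """Render one Gaussian heatmap per (x, y) keypoint on a size x size grid.

    Assumption: heatmaps encode the current estimate; the abstract does not
    give the exact encoding.
    """
    ys, xs = np.mgrid[0:size, 0:size]  # ys: row indices, xs: column indices
    maps = np.empty((len(keypoints), size, size), dtype=np.float64)
    for i, (x, y) in enumerate(keypoints):
        maps[i] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return maps


def refine(image, init_keypoints, global_step, local_step, n_global=4, n_local=1):
    """Apply n_global global corrections, then n_local local corrections.

    global_step / local_step are hypothetical callables standing in for
    trained regressors; each returns an additive (N, 2) correction.
    The image is assumed to be square.
    """
    keypoints = np.asarray(init_keypoints, dtype=np.float64).copy()
    for it in range(n_global + n_local):
        heatmaps = render_heatmaps(keypoints, image.shape[0])
        step = global_step if it < n_global else local_step
        keypoints = keypoints + step(image, heatmaps, keypoints)
    return keypoints


# Toy usage: an oracle "regressor" that moves halfway to a known target.
target = np.array([[10.0, 12.0], [20.0, 12.0], [15.0, 22.0]])
init = np.full_like(target, 16.0)
image = np.zeros((32, 32))


def toy_step(image, heatmaps, keypoints):
    # Stand-in for a learned correction; a real model would use image and heatmaps.
    return 0.5 * (target - keypoints)


result = refine(image, init, toy_step, toy_step)
print(np.abs(result - target).max())
```

With this stand-in the residual halves at every iteration, so each successive correction is smaller than the last. That shrinking error signal is the situation the abstract refers to when it notes that gradients become small as the iterations proceed.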

[1] George Trigeorgis, et al. Adaptive cascaded regression, 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[2] Rama Chellappa, et al. DCNNs on a Diet: Sampling Strategies for Reducing the Training Set Size, 2016, ArXiv.

[3] Stefanos Zafeiriou, et al. Robust Discriminative Response Map Fitting with Constrained Local Models, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[4] Horst Bischof, et al. Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization, 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[5] Jian Sun, et al. Face Alignment Via Component-Based Discriminative Search, 2008, ECCV.

[6] Jian Sun, et al. Face Alignment at 3000 FPS via Regressing Local Binary Features, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[7] Shiguang Shan, et al. Coarse-to-Fine Auto-Encoder Networks (CFAN) for Real-Time Face Alignment, 2014, ECCV.

[8] Cheng Li, et al. Towards Arbitrary-View Face Alignment by Recommendation Trees, 2015, ArXiv.

[9] Jitendra Malik, et al. Human Pose Estimation with Iterative Error Feedback, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Fernando De la Torre, et al. Supervised Descent Method and Its Applications to Face Alignment, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Josephine Sullivan, et al. One millisecond face alignment with an ensemble of regression trees, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12] José Miguel Buenaposada, et al. Head-Pose Estimation In-the-Wild Using a Random Forest, 2016, AMDO.

[13] Cheng Li, et al. Face alignment by coarse-to-fine shape searching, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Jian Sun, et al. Face Alignment by Explicit Shape Regression, 2012, International Journal of Computer Vision.

[15] Xiaoming Liu, et al. Pose-Invariant 3D Face Alignment, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[16] Stefanos Zafeiriou, et al. Incremental Face Alignment in the Wild, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17] Dumitru Erhan, et al. Going deeper with convolutions, 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18] Xiangyu Zhu, et al. Face Alignment in Full Pose Range: A 3D Total Solution, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19] Xiaoming Liu, et al. Large-Pose Face Alignment via CNN-Based Dense 3D Model Fitting, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20] Donghoon Lee, et al. Face alignment using cascade Gaussian process regression trees, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21] Anil K. Jain, et al. Pushing the frontiers of unconstrained face detection and recognition: IARPA Janus Benchmark A, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22] Rama Chellappa, et al. HyperFace: A Deep Multi-Task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23] Maja Pantic, et al. Gauss-Newton Deformable Part Models for Face Alignment In-the-Wild, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[24] Stefanos Zafeiriou, et al. Active Pictorial Structures, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25] Shih-Chieh Huang, et al. Regressive Tree Structured Model for Facial Landmark Localization, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[26] Pietro Perona, et al. Robust Face Landmark Estimation under Occlusion, 2013, 2013 IEEE International Conference on Computer Vision.

[27] Xiaogang Wang, et al. Deep Convolutional Network Cascade for Facial Point Detection, 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[28] Trevor Darrell, et al. Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, ACM Multimedia.

[29] W. Marsden I and J, 2012.

[30] Deva Ramanan, et al. Face detection, pose estimation, and landmark localization in the wild, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[31] George Trigeorgis, et al. Mnemonic Descent Method: A Recurrent Process Applied for End-to-End Face Alignment, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Qiang Ji, et al. Robust Facial Landmark Detection Under Significant Head Poses and Occlusion, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[33] Fernando De la Torre, et al. Global supervised descent method, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.

[35] Junzhou Huang, et al. Pose-Free Facial Landmark Fitting via Optimized Part Mixtures and Cascaded Deformable Shape Model, 2013, 2013 IEEE International Conference on Computer Vision.

[36] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[37] Cheng Li, et al. Unconstrained Face Alignment via Cascaded Compositional Learning, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38] Rama Chellappa, et al. Face Alignment by Local Deep Descriptor Regression, 2016, ArXiv.

[39] Xiaoou Tang, et al. Facial Landmark Detection by Deep Multi-task Learning, 2014, ECCV.