EfficientFAN: Deep Knowledge Transfer for Face Alignment

Face alignment plays an important role in many applications that process facial images. Deep learning-based methods currently achieve excellent results in face alignment, but these models usually have a large number of parameters, which leads to high computational complexity and long execution times. This paper proposes a lightweight, efficient, and effective model named Efficient Face Alignment Network (EfficientFAN). EfficientFAN adopts an encoder-decoder structure, using the simple EfficientNet-B0 backbone as the encoder and three deconvolutional layers as the decoder. Compared with state-of-the-art models, it achieves comparable performance with fewer parameters, lower computational cost, and higher speed. The accuracy of EfficientFAN is further improved by transferring the deep knowledge of a complex teacher network through feature-aligned distillation and patch similarity distillation. Extensive experiments on public data sets demonstrate the superiority of EfficientFAN over state-of-the-art methods.
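The abstract names patch similarity distillation but does not define it. As a rough illustration only, the sketch below assumes a common pairwise-similarity formulation: each feature map is split into patches, each patch is summarized as a feature vector, and the student is penalized when its patch-to-patch cosine similarities differ from the teacher's. The function names, the cosine measure, and the mean-squared penalty are all assumptions made for this example, not the authors' definition.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for this sketch: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(patches):
    """Pairwise cosine similarities between all patch feature vectors."""
    return [[cosine(p, q) for q in patches] for p in patches]


def patch_similarity_loss(student_patches, teacher_patches):
    """Mean squared difference between student and teacher similarity matrices.

    Assumption: both networks yield the same number of patches. Their channel
    widths may differ, because only patch-to-patch similarities are compared.
    """
    if not student_patches or len(student_patches) != len(teacher_patches):
        raise ValueError("need the same non-zero number of patches from both networks")
    s = similarity_matrix(student_patches)
    t = similarity_matrix(teacher_patches)
    n = len(s)
    return sum((s[i][j] - t[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)


# Hypothetical toy features: three patches, teacher with 3 channels, student with 2.
teacher = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
student_good = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]  # same similarity structure
student_bad = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]   # all patches look identical

loss_good = patch_similarity_loss(student_good, teacher)
loss_bad = patch_similarity_loss(student_bad, teacher)
```

In this toy case `loss_good` is close to zero because the student reproduces the teacher's similarity structure despite having fewer channels, while `loss_bad` is clearly larger. Comparing similarities rather than raw features is one way a narrow student can be matched against a wider teacher without an extra projection layer.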
