Learning Lightweight Face Detector with Knowledge Distillation

Despite that face detection has progressed significantly in recent years, it is still a challenging task to get a fast face detector with competitive performance, especially on CPU based devices. In this paper, we propose a novel loss function based on knowledge distillation to boost the performance of lightweight face detectors. More specifically, a student detector learns additional soft label from a teacher detector by mimicking its classification map. To make the knowledge transfer more efficient, a threshold function is designed to assign threshold values adaptively for different objectness scores such that only the informative samples are used for mimicking. Experiments on FDDB and WIDER FACE show that the proposed method improves the performance of face detectors consistently. With the help of the proposed training method, we get a CPU real-time face detector that runs at 20 FPS while being state-of-the-art on performance among CPU based detectors.

[1]  Yu Qiao,et al.  Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks , 2016, IEEE Signal Processing Letters.

[2]  Shengcai Liao,et al.  A Fast and Accurate Unconstrained Face Detector , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Marios Savvides,et al.  CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection , 2016, ArXiv.

[4]  Charless C. Fowlkes,et al.  Occlusion Coherence: Detecting and Localizing Occluded Faces , 2015, ArXiv.

[5]  Bin Yang,et al.  Aggregate channel features for multi-view face detection , 2014, IEEE International Joint Conference on Biometrics.

[6]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[7]  Shuo Yang,et al.  WIDER FACE: A Face Detection Benchmark , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Shuo Yang,et al.  Face Detection through Scale-Friendly Deep Convolutional Networks , 2017, ArXiv.

[9]  Hao Wang,et al.  Detecting Faces Using Inside Cascaded Contextual CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[10]  Zhaoxiang Zhang,et al.  DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer , 2017, AAAI.

[11]  Tony X. Han,et al.  Learning Efficient Object Detection Models with Knowledge Distillation , 2017, NIPS.

[12]  Junjie Yan,et al.  The Fastest Deformable Part Model for Object Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Xiaolin Hu,et al.  Joint Training of Cascaded CNN for Face Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Luc Van Gool,et al.  Face Detection without Bells and Whistles , 2014, ECCV.

[15]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Erik Learned-Miller,et al.  FDDB: A benchmark for face detection in unconstrained settings , 2010 .

[17]  Bin Yang,et al.  Convolutional Channel Features , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[18]  Jian Sun,et al.  Joint Cascade Face Detection and Alignment , 2014, ECCV.

[19]  Shuo Yang,et al.  From Facial Parts Responses to Face Detection: A Deep Learning Approach , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[20]  Xiangyu Zhu,et al.  Face Alignment in Full Pose Range: A 3D Total Solution , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Junjie Yan,et al.  Mimicking Very Efficient Network for Object Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Mohan M. Trivedi,et al.  To boost or not to boost? On the limits of boosted trees for object detection , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[23]  Shifeng Zhang,et al.  S^3FD: Single Shot Scale-Invariant Face Detector , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Shifeng Zhang,et al.  FaceBoxes: A CPU real-time face detector with high accuracy , 2017, 2017 IEEE International Joint Conference on Biometrics (IJCB).

[25]  Vishal M. Patel,et al.  DAFE-FD: Density Aware Feature Enrichment for Face Detection , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[26]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Xiangyu Zhu,et al.  High-fidelity Pose and Expression Normalization for face recognition in the wild , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Geoffrey E. Hinton,et al.  Distilling the Knowledge in a Neural Network , 2015, ArXiv.

[29]  Rich Caruana,et al.  Do Deep Nets Really Need to be Deep? , 2013, NIPS.

[30]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Peiyun Hu,et al.  Finding Tiny Faces , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Vladimir Pavlovic,et al.  Face tracking and recognition with visual constraints in real-world videos , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  C. V. Jawahar,et al.  Visual Phrases for Exemplar Face Detection , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[34]  Gang Hua,et al.  A convolutional neural network cascade for face detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Yu Liu,et al.  Recurrent Scale Approximation for Object Detection in CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[36]  Yizhou Wang,et al.  Face Detection with End-to-End Integration of a ConvNet and a 3D Model , 2016, ECCV.

[37]  Larry S. Davis,et al.  SSH: Single Stage Headless Face Detector , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).