Efficient face detection and tracking in video sequences based on deep learning

Abstract: Video-based face detection and tracking is widely used in video surveillance, safe driving, and medical diagnosis. In video sequences, most existing face detection and tracking methods suffer from interference caused by occlusion, ambient illumination, and changes in human posture. To track human faces in video sequences accurately, we propose an efficient deep-learning-based face detection and tracking framework that comprises a SENResNet face detection model and a Regression Network-based Face Tracking (RNFT) model. First, the SENResNet model integrates the Squeeze-and-Excitation Network (SEN) with the Residual Neural Network (ResNet). Because deep neural networks are difficult to train, we use ResNet to overcome the vanishing-gradient problem in deep network training. To fuse the features of each channel during the convolution operation, we further integrate the SEN module into the SENResNet model. SENResNet detects facial information in each frame and extracts the position of the target face, thereby providing an initialization window for face tracking. Then, the RNFT model extracts facial features from adjacent frames and predicts the position of the target face in the next frame. To address the problem of feature scaling, we add a correction network to the RNFT model. The improved RNFT model extracts the rectangular frame of the target face in the previous frame and strengthens the perception of feature scaling, thereby improving its accuracy. Extensive experiments on public facial and video datasets show that the proposed SENResNet and RNFT models outperform the state-of-the-art comparison methods in terms of accuracy and performance.
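
To make concrete how a squeeze-and-excitation step can be combined with a residual connection, the following is a minimal sketch in plain Python. It is not the authors' SENResNet: the function names, the weight layout, and the `residual_fn` placeholder (standing in for the convolutional branch) are assumptions for illustration, and convolution, batch normalization, and the final activation are omitted. It only shows the generic pattern of global average pooling per channel, a two-layer gating network, channel-wise rescaling, and the skip connection.

```python
import math

def squeeze(feature_map):
    """Squeeze: global average pooling, one scalar per channel.
    feature_map is a list of channels, each a list of rows (H x W)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

def dense(vec, weights, bias):
    """Fully connected layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def excite(z, w1, b1, w2, b2):
    """Excitation: FC -> ReLU -> FC -> sigmoid, giving one gate per channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]

def se_residual_block(x, residual_fn, w1, b1, w2, b2):
    """Return x + gate * F(x), where F is the residual branch
    (hypothetical placeholder for the convolutional layers)."""
    f = residual_fn(x)
    gates = excite(squeeze(f), w1, b1, w2, b2)
    return [[[xv + g * fv for xv, fv in zip(x_row, f_row)]
             for x_row, f_row in zip(x_ch, f_ch)]
            for x_ch, f_ch, g in zip(x, f, gates)]

# Toy input: 2 channels of size 2 x 2, identity residual branch,
# and all-zero gating weights (so every gate equals sigmoid(0) = 0.5).
x = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 4.0]]]
w1, b1 = [[0.0, 0.0]], [0.0]          # 2 channels -> 1 hidden unit
w2, b2 = [[0.0], [0.0]], [0.0, 0.0]   # 1 hidden unit -> 2 gates
out = se_residual_block(x, lambda t: t, w1, b1, w2, b2)
```

The skip connection (`xv + ...`) keeps a direct path for the input, which is the property the abstract relies on for trainability, while the per-channel gates reweight the residual branch. In a trained network the gating weights would be learned rather than fixed as in this toy example.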
