Efficient convolutional neural network with multi-kernel enhancement features for real-time facial expression recognition

Facial expressions are the most direct external manifestation of personal emotions. Unlike in other pattern recognition problems, the feature differences between facial expressions are small. General methods either struggle to characterize these differences effectively or have too many parameters to run in real time. This paper proposes a lightweight, mobile-oriented multi-kernel feature facial expression recognition network that balances speed and accuracy for real-time facial expression recognition. First, a multi-kernel convolution block is designed by using three depthwise separable convolution kernels of different sizes in parallel. The small and the large kernels extract local details and edge contour information of facial expressions, respectively. The multi-channel information is then fused to obtain multi-kernel enhancement features that better describe the differences between facial expressions. Second, a "Channel Split" operation is performed on the input of the multi-kernel convolution block, which avoids repeated extraction of invalid information and reduces the number of parameters to one-third of the original. Finally, a lightweight multi-kernel feature expression recognition network is designed by alternating multi-kernel convolution blocks with depthwise separable convolutions to further improve the feature representation ability. Experimental results show that the proposed network achieves accuracies of 73.3% and 99.5% on the FER-2013 and CK+ datasets, respectively. Furthermore, it achieves a speed of 78 frames per second on 640 × 480 video. It is superior to other state-of-the-art methods in terms of speed and accuracy.
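The one-third figure for "Channel Split" can be illustrated with a simple weight count. The sketch below is not the authors' implementation. It assumes a block in which three parallel depthwise convolutions are concatenated and fused by a single 1 × 1 pointwise convolution, ignores biases, and uses hypothetical kernel sizes of 3, 5, and 7 and a hypothetical width of 96 channels, none of which are given in the abstract. The function name `multi_kernel_params` is likewise illustrative.

```python
def multi_kernel_params(channels, out_channels, kernel_sizes=(3, 5, 7), split=False):
    """Count weights in an assumed multi-kernel block (biases ignored).

    Assumed layout: one depthwise convolution per kernel size, run in
    parallel, then concatenation and a single 1x1 pointwise fusion.
    """
    branches = len(kernel_sizes)
    if split:
        # Channel Split: each branch receives a disjoint slice of the input.
        if channels % branches != 0:
            raise ValueError("channels must be divisible by the number of branches")
        per_branch = channels // branches
    else:
        # No split: every branch processes all input channels.
        per_branch = channels

    # A depthwise convolution has one k x k filter per input channel.
    depthwise = sum(k * k * per_branch for k in kernel_sizes)

    # The pointwise fusion maps the concatenated branch outputs to out_channels.
    fused_channels = per_branch * branches
    pointwise = fused_channels * out_channels

    return depthwise + pointwise


full = multi_kernel_params(96, 96, split=False)
reduced = multi_kernel_params(96, 96, split=True)
print(full, reduced, reduced / full)
```

Under these assumptions, both the depthwise and the pointwise terms shrink by a factor equal to the number of branches, so the ratio is exactly one-third regardless of the chosen kernel sizes. A different fusion scheme, for example a separate pointwise convolution per branch, would give a different ratio.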
