Lightweight Deep Learning Model For Facial Expression Recognition

Facial expression recognition (FER) is a cornerstone of many applications, such as driver fatigue monitoring, social robotics, and medical treatment. Recognizing facial expressions with high accuracy is challenging because subtle features are not easily captured. Deep learning is expected to play an essential role in tackling this challenge and improving recognition accuracy. Although existing complex deep learning models can achieve high accuracy, their computational cost is too high for resource-constrained devices such as Internet of Things (IoT) devices. In this work, we propose a lightweight deep learning model based on MobileNetV2 and Inception that reduces computational cost while maintaining relatively high accuracy. Specifically, the proposed model uses an Inception convolutional neural network (CNN) to extract features from the inputs, backbone bottleneck layers to compress the model and learn efficiently, and a CNN block to expand the features learned in the bottleneck layers and feed them to the fully connected layer for classification. The proposed model is highly efficient and small enough to be deployed on devices with limited computational resources. Experimental results demonstrate that the proposed lightweight model achieves relatively high accuracy for expression recognition while significantly reducing computational cost.
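The abstract does not give layer-level details, so the following is only a rough illustration of why MobileNet-style building blocks are cheap, not the authors' model. MobileNetV2 bottlenecks are built around depthwise separable convolutions, and the sketch compares the multiply-accumulate (MAC) count of one standard 3x3 convolution with its depthwise separable counterpart. The function names, the feature-map size, and the channel counts are all hypothetical values chosen for illustration.

```python
def standard_conv_macs(h, w, c_in, c_out, k=3):
    """MACs for one k x k standard convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k=3):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer shape, not taken from the paper.
H, W, C_IN, C_OUT = 28, 28, 64, 64

standard = standard_conv_macs(H, W, C_IN, C_OUT)
separable = depthwise_separable_macs(H, W, C_IN, C_OUT)

print(f"standard conv:            {standard:,} MACs")
print(f"depthwise separable conv: {separable:,} MACs")
print(f"reduction factor:         {standard / separable:.2f}x")
```

For this assumed layer shape the separable form needs several times fewer MACs than the standard convolution. The saving for a full network depends on the actual layer widths, strides, and expansion factors, which the abstract does not specify.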
