Expression Recognition Method Based on a Lightweight Convolutional Neural Network

Effective emotion recognition algorithms can help machines better understand people and promote the development of human-computer interaction applications. In recent years, many studies have trained deep neural network models on benchmark expression datasets to achieve state-of-the-art results. These high-accuracy models usually contain hundreds of layers, so they are computationally expensive and may not be suitable for real-world scenarios. This paper proposes a lightweight emotion recognition (LER) model to address the latency problem under natural conditions. The three main contributions of this paper are as follows. 1) The LER model incorporates a densely connected convolution layer and model compression techniques into a framework that eliminates redundant parameters. 2) Multichannel input is introduced to preprocess the image data, which improves the learning ability of the model. 3) Experiments show that the proposed LER model outperforms other lightweight models on the FER2013 and FERPLUS datasets. Compared with the VGG13 used in previous work, the LER model achieves higher accuracy with 97 times fewer parameters. Finally, the FERFIN dataset is created, which has less noisy data and more accurate labels than the FERPLUS dataset.
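To illustrate why dense connectivity can reduce the parameter count, the following is a minimal sketch that counts convolution weights in a densely connected block and in a plain convolution stack with the same output width. It is not the LER architecture: the function names, the channel counts, the growth rate, and the 3x3 kernel size are all assumptions chosen for illustration, and the resulting ratio is unrelated to the 97-fold reduction reported above.

```python
def conv_params(in_ch, out_ch, kernel=3):
    """Number of weights in a 2-D convolution layer (bias terms ignored)."""
    return in_ch * out_ch * kernel * kernel


def dense_block_params(in_ch, growth_rate, num_layers, kernel=3):
    """Weights in a densely connected block.

    Each layer receives the concatenation of all earlier feature maps and
    emits only `growth_rate` new channels. Returns (weights, output channels).
    """
    total = 0
    channels = in_ch
    for _ in range(num_layers):
        total += conv_params(channels, growth_rate, kernel)
        channels += growth_rate  # new feature maps are concatenated
    return total, channels


def plain_stack_params(in_ch, width, num_layers, kernel=3):
    """Weights in a plain stack where every layer emits `width` channels."""
    total = 0
    channels = in_ch
    for _ in range(num_layers):
        total += conv_params(channels, width, kernel)
        channels = width
    return total


if __name__ == "__main__":
    # Hypothetical sizes: 16 input channels, growth rate 12, 4 layers.
    dense_total, dense_out = dense_block_params(16, 12, 4)
    plain_total = plain_stack_params(16, dense_out, 4)
    print("dense block weights:", dense_total, "output channels:", dense_out)
    print("plain stack weights:", plain_total)
    print("ratio:", round(plain_total / dense_total, 2))
```

With these hypothetical sizes, both variants end with 64 channels, but the dense block needs far fewer weights because each layer learns only a small number of new feature maps and reuses the earlier ones through concatenation.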
