Fusing multi-stream deep neural networks for facial expression recognition

Facial expression is one of the main factors conveying the emotional state of an individual. It is the most important form of nonverbal communication, and recognizing it is a challenging task in the field of computer vision. In this work, we propose a combined deep architecture for facial expression recognition that uses appearance features and geometric features, extracted separately with convolution layers and the supervised descent method, respectively. The proposed model is trained on three public databases: the Extended Cohn-Kanade (CK+), the OULU-CASIA VIS, and the JAFFE. Because these databases contain a limited amount of data, we enlarge them by applying a data augmentation step to the original images. For comparison, two additional models, one using appearance features only and one using geometric features only, are trained on the same subset of data to show how combining the two deep architectures influences the results. In addition, a cross-database evaluation is performed to investigate how well the combined model generalizes. The obtained results reach state-of-the-art performance and improve on recent work, especially in the cross-database evaluation.
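The fusion idea described above can be illustrated with a minimal sketch. It is not the architecture of this work: the layer sizes, the number of landmarks, the number of classes, the random weights, and the choice of fusing by concatenation are all assumptions made for illustration, and the appearance vector stands in for the output of convolution layers while the geometry vector stands in for landmark coordinates such as those produced by the supervised descent method.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row, W.x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def fused_forward(appearance, geometry, params):
    """Process each stream separately, concatenate, then classify."""
    h_app = relu(dense(appearance, *params["app"]))  # appearance stream
    h_geo = relu(dense(geometry, *params["geo"]))    # geometric stream
    fused = h_app + h_geo                            # list concatenation
    return softmax(dense(fused, *params["out"]))


# Hypothetical sizes, not taken from the paper.
N_APP = 16         # length of the appearance feature vector
N_LANDMARKS = 10   # number of facial landmarks
N_GEO = 2 * N_LANDMARKS  # (x, y) per landmark
N_HID = 8          # hidden units per stream
N_CLASSES = 7      # number of expression classes

rng = random.Random(0)
params = {
    "app": init_layer(N_HID, N_APP, rng),
    "geo": init_layer(N_HID, N_GEO, rng),
    "out": init_layer(N_CLASSES, 2 * N_HID, rng),
}

appearance = [rng.random() for _ in range(N_APP)]
geometry = [rng.random() for _ in range(N_GEO)]
probs = fused_forward(appearance, geometry, params)
print(len(probs), round(sum(probs), 6))
```

The appearance-only and geometry-only baselines mentioned above would correspond, in this sketch, to feeding `h_app` or `h_geo` alone to an output layer of matching input size.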
