Subject Independent Facial Expression Recognition: Cross-Connection and Spatial Pyramid Pooling Convolutional Neural Network

Facial expression recognition remains an open problem, especially in the subject-independent setting, where the test subjects do not appear in the training data. On the one hand, morphological variation, ethnic differences, and other factors cause the same expression to look very different from one individual to another. On the other hand, there is currently no publicly available expression dataset large enough to train deep neural networks well. To address this, this paper proposes a cross-connection and spatial pyramid pooling convolutional neural network. The model uses spatial pyramid pooling to enhance high-level features, and it also combines cross-connections with spatial pyramid pooling to extract important low-level features. Finally, the features from the different levels are combined to improve the generalization performance of the model. We validate our approach on four widely used public expression datasets (CK+, JAFFE, MMI, NimStim). Compared with other facial expression recognition methods, the proposed method achieves comparable or superior results. In the subject-independent setting, the model reaches 97.41% accuracy on the CK+ dataset.
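To make the two building blocks concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes max pooling, pyramid levels of 1, 2 and 4 bins per side, a floor/ceil rule for bin boundaries, and simple concatenation as the way low-level and high-level features are joined; none of these details are given in the abstract. The names `spatial_pyramid_pool` and `fuse_levels` are hypothetical.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a fixed-length vector.

    For each pyramid level n the map is split into an n x n grid of bins and
    the maximum of each bin is kept, so the output length is
    C * sum(n * n for n in levels) regardless of H and W.
    """
    pooled = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                # Assumed bin rule: floor for the start, ceil for the end.
                r0 = (i * height) // n
                r1 = math.ceil((i + 1) * height / n)
                for j in range(n):
                    c0 = (j * width) // n
                    c1 = math.ceil((j + 1) * width / n)
                    pooled.append(
                        max(channel[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1))
                    )
    return pooled


def fuse_levels(low_level_map, high_level_map, levels=(1, 2, 4)):
    """Hypothetical cross-connection: pool both maps and concatenate them."""
    return (spatial_pyramid_pool(low_level_map, levels)
            + spatial_pyramid_pool(high_level_map, levels))


# Usage: an early layer (2 channels, 6 x 6) and a late layer (3 channels, 3 x 3).
low = [[[float(r * 6 + c) for c in range(6)] for r in range(6)] for _ in range(2)]
high = [[[float(r * 3 + c) for c in range(3)] for r in range(3)] for _ in range(3)]
fused = fuse_levels(low, high)
print(len(fused))  # 2 * 21 + 3 * 21 = 105
```

Because each pyramid level always yields the same number of bins, feature maps of different spatial sizes produce vectors of the same length per channel, which is what allows features taken from different depths of the network to be joined into a single descriptor.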
