Facial expression recognition with Convolutional Neural Networks: Coping with few data and the training sample order

Facial expression recognition has been an active research area in the past 10 years, with growing application areas including avatar animation, neuromarketing and sociable robots. Recognizing facial expressions is not an easy problem for machine learning methods, since people can vary significantly in the way they show their expressions. Even images of the same person with the same facial expression can vary in brightness, background and pose, and these variations grow when different subjects are considered (because of differences in face shape and ethnicity, among others). Although facial expression recognition has been widely studied in the literature, few works perform a fair evaluation that avoids mixing subjects between training and testing of the proposed algorithms. Hence, facial expression recognition is still a challenging problem in computer vision. In this work, we propose a simple solution for facial expression recognition that combines a Convolutional Neural Network with specific image pre-processing steps. Convolutional Neural Networks achieve better accuracy with large amounts of data. However, there are no publicly available datasets with sufficient data for facial expression recognition with deep architectures. Therefore, to tackle the problem, we apply some pre-processing techniques to extract only expression-specific features from a face image, and we explore the presentation order of the samples during training. The experiments used to evaluate our technique were carried out on three widely used public databases (CK+, JAFFE and BU-3DFE). A study of the impact of each image pre-processing operation on the accuracy rate is presented. The proposed method achieves competitive results when compared with other facial expression recognition methods (96.76% accuracy on the CK+ database), is fast to train, and allows for real-time facial expression recognition on standard computers.
Highlights
- A CNN-based approach for facial expression recognition.
- A set of pre-processing steps allowing for a simpler CNN architecture.
- A study of the impact of each pre-processing step on the accuracy.
- A study for lowering the impact of the sample presentation order during training.
- High facial expression recognition accuracy (96.76%) with real-time evaluation.
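
Two points in the abstract lend themselves to a small illustration: keeping subjects separate between training and testing, and presenting the same training samples in different orders. The following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes every image carries a subject identifier; the function names `subject_independent_folds` and `presentation_orders` and the toy data are hypothetical. The abstract does not say how the paper actually handles the presentation order, so the second function only generates several shuffled orders that a training loop could be run on and compared.

```python
import random
from collections import defaultdict


def subject_independent_folds(subject_ids, n_folds, seed=0):
    """Group sample indices into folds so that no subject spans two folds.

    Hypothetical helper: subject_ids[i] is the subject shown in sample i.
    """
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # Shuffle subjects (not samples) so fold membership is decided per person.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    folds = [[] for _ in range(n_folds)]
    for k, sid in enumerate(subjects):
        folds[k % n_folds].extend(by_subject[sid])
    return folds


def presentation_orders(train_indices, n_orders, seed=0):
    """Return several shuffled copies of the same training indices.

    Each copy is one candidate presentation order for a training run.
    """
    rng = random.Random(seed)
    orders = []
    for _ in range(n_orders):
        order = list(train_indices)
        rng.shuffle(order)
        orders.append(order)
    return orders


# Toy data (assumption): 8 images from 4 subjects, 2 images each.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]

folds = subject_independent_folds(subject_ids, n_folds=2)
test_idx, train_idx = folds[0], folds[1]

# Three different orders of the same training set, e.g. one per training run.
orders = presentation_orders(train_idx, n_orders=3)
```

Splitting by subject rather than by image is what prevents the same person from appearing in both sets. Training once per entry in `orders` and comparing the resulting accuracies is one simple way to observe how much the order matters.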
