An efficient unconstrained facial expression recognition algorithm based on Stack Binarized Auto-encoders and Binarized Neural Networks

Abstract: Although deep learning has achieved good performance in many pattern recognition tasks, over-fitting remains a serious problem when training deep networks with large numbers of parameters on limited labeled data. In this work, Binarized Auto-encoders (BAEs) and Stacked Binarized Auto-encoders (Stacked BAEs) are proposed to learn domain knowledge from a large-scale unlabeled facial dataset. Transferring this knowledge to a supervised learning task that uses Binarized Neural Networks (BNNs) and has only limited labeled data improves the performance of the BNNs. A real-world facial expression recognition system is constructed by combining an unconstrained face normalization method, a variant of the LBP descriptor, BAEs, and BNNs. Experimental results show that the whole system achieves good performance on the Static Facial Expressions in the Wild (SFEW) benchmark with minimal hardware requirements and lower memory and computation costs.
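The pipeline described in the abstract (unsupervised pre-training with a BAE, then reuse of the learned weights in a BNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `BinarizedAutoEncoder`, the tied encoder/decoder weights, the layer sizes, and the deterministic sign binarization are all assumptions made for illustration, and training is omitted entirely. Only the forward pass and the weight hand-off are shown.

```python
import numpy as np

def binarize(x):
    """Deterministic sign binarization to {-1, +1}; zero maps to +1 (an assumption)."""
    return np.where(x >= 0, 1.0, -1.0)

class BinarizedAutoEncoder:
    """Hypothetical one-hidden-layer auto-encoder with tied, binarized weights.

    Real-valued latent weights are kept and binarized on every forward pass,
    a common convention for binarized networks (assumed here, not taken
    from the abstract).
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w_real = rng.uniform(-1.0, 1.0, size=(n_in, n_hidden))

    def encode(self, x):
        # Binary weights and binary activations.
        return binarize(x @ binarize(self.w_real))

    def decode(self, h):
        # Tied weights: the decoder reuses the transposed binary encoder matrix.
        return h @ binarize(self.w_real).T

# Stage 1 (training omitted): a BAE that would be fitted on unlabeled features.
bae = BinarizedAutoEncoder(n_in=8, n_hidden=4)
x = binarize(np.random.default_rng(1).standard_normal((3, 8)))
codes = bae.encode(x)      # binary hidden codes, shape (3, 4)
recon = bae.decode(codes)  # real-valued reconstruction, shape (3, 8)

# Stage 2: transfer. The first BNN layer is initialised from the BAE's latent
# weights before supervised fine-tuning on the small labeled set.
bnn_first_layer = bae.w_real.copy()
```

Copying the real-valued latent weights, rather than only their signs, would leave the supervised stage free to keep adjusting them; whether the paper does this is not stated in the abstract.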
