VEGAC: Visual Saliency-based Age, Gender, and Facial Expression Classification Using Convolutional Neural Networks

This paper explores the use of visual saliency to classify age, gender, and facial expression from facial images. For multi-task classification, we propose VEGAC, a method based on visual saliency. Using the Deep Multi-level Network [1] and an off-the-shelf face detector [2], our method first detects the face in the test image and then extracts CNN predictions on the cropped face. The CNN of VEGAC was fine-tuned on a dataset collected from different benchmarks. It uses the VGG-16 architecture [3] and is pre-trained on ImageNet for image classification. We demonstrate the usefulness of our method for age estimation, gender classification, and facial expression classification, and show that it obtains competitive results on selected benchmarks. All our models and code will be made publicly available.
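The flow described above (detect a face, crop it, and obtain one prediction per task) can be illustrated with a minimal sketch. This is not the authors' implementation: the bounding box, the random saliency map, the pixel-wise weighting used to combine saliency with the crop, the mean-pooled "features", the linear heads, and the class counts (8 age groups, 2 genders, 7 expressions) are all hypothetical stand-ins for the face detector, the saliency network, and the fine-tuned VGG-16.

```python
import numpy as np

def crop_face(image, box):
    """Crop an H x W x 3 image to a (top, left, height, width) face box."""
    top, left, h, w = box
    return image[top:top + h, left:left + w]

def apply_saliency(face, saliency):
    """Weight each pixel by a saliency value in [0, 1] (assumed fusion step)."""
    return face * saliency[..., None]

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_heads(features, heads):
    """Run one linear softmax head per task on a shared feature vector."""
    return {task: softmax(W @ features) for task, W in heads.items()}

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))          # stand-in for a test image
box = (8, 8, 32, 32)                     # hypothetical face-detector output
face = crop_face(image, box)
saliency = rng.random(face.shape[:2])    # stand-in for a predicted saliency map
weighted = apply_saliency(face, saliency)
features = weighted.mean(axis=(0, 1))    # stand-in for CNN features (3 values)

# Hypothetical class counts per task; the abstract does not specify them.
heads = {
    "age": rng.normal(size=(8, 3)),
    "gender": rng.normal(size=(2, 3)),
    "expression": rng.normal(size=(7, 3)),
}
preds = predict_heads(features, heads)
for task, probs in preds.items():
    print(task, int(probs.argmax()))
```

Sharing one feature vector across three heads is only one way to realize multi-task classification; the abstract does not say whether VEGAC shares a backbone or trains a separate network per task.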

[1]  Sergio Escalera,et al.  Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Shanmuganathan Raman,et al.  Facial Expression Recognition Using Visual Saliency and Deep Learning , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[4]  Nanning Zheng,et al.  Learning to Detect a Salient Object , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[6]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[7]  Ayesha Gurnani,et al.  Human Detection and Tracking for Video Surveillance: A Cognitive Science Approach , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[8]  Guoshan Zhang,et al.  A Saliency Based Human Detection Framework for Infrared Thermal Images , 2017, CCCV.

[9]  Niels da Vitoria Lobo,et al.  Age classification from facial images , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Yibin Li,et al.  Facial Expression Recognition with Fusion Features Extracted from Salient Facial Areas , 2017, Sensors.

[11]  Gang Hua,et al.  A convolutional neural network cascade for face detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Shiguang Shan,et al.  Deeply Learning Deformable Facial Action Parts Model for Dynamic Expression Analysis , 2014, ACCV.

[14]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[15]  Mahmoud Afifi,et al.  AFIF4: Deep Gender Classification based on AdaBoost-based Fusion of Isolated Facial Features and Foggy Faces , 2017, J. Vis. Commun. Image Represent..

[16]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[17]  Vibhav Vineet,et al.  Efficient Salient Region Detection with Soft Image Abstraction , 2013, 2013 IEEE International Conference on Computer Vision.

[18]  Alexander Binder,et al.  Understanding and Comparing Deep Neural Networks for Age and Gender Classification , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[19]  Ayesha Gurnani,et al.  Using Visual Saliency to Improve Human Detection with Convolutional Networks , 2018, ArXiv.

[20]  Rita Cucchiara,et al.  A deep multi-level network for saliency prediction , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[21]  Dong Yu,et al.  Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP] , 2011, IEEE Signal Processing Magazine.

[22]  Tal Hassner,et al.  Age and Gender Estimation of Unfiltered Faces , 2014, IEEE Transactions on Information Forensics and Security.

[23]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Qi Zhao,et al.  SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[25]  Mohammad H. Mahoor,et al.  AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild , 2017, IEEE Transactions on Affective Computing.

[26]  Shiguang Shan,et al.  Adaptive Partial Differential Equation Learning for Visual Saliency Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Tal Hassner,et al.  Age and gender classification using convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[28]  Yizhou Yu,et al.  Visual saliency based on multiscale deep features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Shiguang Shan,et al.  Learning Expressionlets on Spatio-temporal Manifold for Dynamic Facial Expression Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Terrence J. Sejnowski,et al.  SEXNET: A Neural Network Identifies Sex From Human Faces , 1990, NIPS.

[31]  Jian Sun,et al.  Saliency Optimization from Robust Background Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[33]  Luc Van Gool,et al.  Face Detection without Bells and Whistles , 2014, ECCV.

[34]  Nello Cristianini,et al.  Gender Classification by Deep Learning on Millions of Weakly Labelled Images , 2016, 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW).

[35]  Huchuan Lu,et al.  Adaptive Metric Learning for Saliency Detection , 2015, IEEE Transactions on Image Processing.

[36]  Yu Qiao,et al.  Gender and Smile Classification Using Deep Convolutional Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[37]  Sabine Süsstrunk,et al.  Frequency-tuned salient region detection , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Carlos D. Castillo,et al.  An All-In-One Convolutional Neural Network for Face Analysis , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[39]  Xiaogang Wang,et al.  Person Re-Identification by Saliency Learning , 2014.

[40]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009.

[41]  Claudio A. Perez,et al.  Gender Classification Based on Fusion of Different Spatial Scale Features Selected by Mutual Information From Histogram of LBP, Intensity, and Shape , 2013, IEEE Transactions on Information Forensics and Security.

[42]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[43]  Junmo Kim,et al.  Deep generative-contrastive networks for facial expression recognition , 2017, ArXiv.

[44]  Afshin Dehghan,et al.  DAGER: Deep Age, Gender and Emotion Recognition Using Convolutional Neural Network , 2017, ArXiv.

[45]  Azar Fazel,et al.  Convolutional Neural Networks for Facial Expression Recognition , 2017, ArXiv.

[46]  Cordelia Schmid,et al.  A Spatio-Temporal Descriptor Based on 3D-Gradients , 2008, BMVC.

[47]  Cha Zhang,et al.  Image based Static Facial Expression Recognition with Multiple Deep Network Learning , 2015, ICMI.

[48]  Changsheng Xu,et al.  Robust gender classification on unconstrained face images , 2015, ICIMCS '15.

[49]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[50]  Niels da Vitoria Lobo,et al.  Age Classification from Facial Images , 1999, Comput. Vis. Image Underst..

[51]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[52]  Aurobinda Routray,et al.  Automatic facial expression recognition using features of salient facial patches , 2015, IEEE Transactions on Affective Computing.

[53]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments , 2008.

[54]  Hubert Konik,et al.  Saliency-based framework for facial expression recognition , 2018, Frontiers of Computer Science.

[55]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[56]  Michael J. Lyons,et al.  Coding facial expressions with Gabor wavelets , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[57]  Léon Bottou,et al.  Large-Scale Machine Learning with Stochastic Gradient Descent , 2010, COMPSTAT.

[58]  Junmo Kim,et al.  Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[59]  Tal Hassner,et al.  Effective face frontalization in unconstrained images , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[60]  Caifeng Shan,et al.  Learning local binary patterns for gender classification on real-world face images , 2012, Pattern Recognit. Lett..

[61]  Hamid Krim,et al.  AOGNets: Deep AND-OR Grammar Networks for Visual Recognition , 2017, ArXiv.

[62]  Nuno Vasconcelos,et al.  Learning Optimal Seeds for Diffusion-Based Salient Object Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[63]  Buket D. Barkana,et al.  Deep Convolutional Neural Network for Age Estimation based on VGG-Face Model , 2017, ArXiv.

[64]  William Lopez,et al.  Cascade Classifiers and Saliency Maps Based People Detection , 2017, AVR.

[65]  Yu Yan,et al.  An Unconstrained Face Detection Algorithm Based on Visual Saliency , 2017, EIDWT.

[66]  Luc Van Gool,et al.  DEX: Deep EXpectation of Apparent Age from a Single Image , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[67]  Honglak Lee,et al.  Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.

[68]  Maja Pantic,et al.  Local Deep Neural Networks for Age and Gender Classification , 2017, ArXiv.

[69]  Weria Khaksar,et al.  Facial Expression Recognition Using Salient Features and Convolutional Neural Network , 2017, IEEE Access.

[70]  Kamal Nasrollahi,et al.  Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification , 2017, IEEE Transactions on Cybernetics.

[71]  Matti Pietikäinen,et al.  Dynamic Texture Recognition Using Local Binary Patterns with an Application to Facial Expressions , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[72]  Garrison W. Cottrell,et al.  Representing Face Images for Emotion Classification , 1996, NIPS.

[73]  Ali Borji,et al.  Boosting bottom-up and top-down visual features for saliency estimation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[74]  Rajat Raina,et al.  Large-scale deep unsupervised learning using graphics processors , 2009, ICML '09.

[75]  Vandit Gajjar,et al.  2^B3^C: 2 Box 3 Crop of Facial Image for Gender Classification with Convolutional Networks , 2018, ArXiv.