SAF-BAGE: Salient Approach for Facial Soft-Biometric Classification - Age, Gender, and Facial Expression

How can we improve the facial soft-biometric classification with help of the human visual system? This paper explores the use of saliency which is equivalent to the human visual system to classify Age, Gender and Facial Expression soft-biometric for facial images. Using the Deep Multi-level Network (ML-Net) [1] and off-the-shelf face detector [2], we propose our approach - SAF-BAGE, which first detects the face in the test image, increases the Bounding Box (B-Box) margin by 30%, finds the saliency map using ML-Net, with 30% reweighted ratio of saliency map, it multiplies with the input cropped face and extracts the Convolutional Neural Networks (CNN) predictions on the multiplied reweighted salient face. Our CNN uses the model AlexNet [3], which is pre-trained on ImageNet. The proposed approach surpasses the performance of other approaches, increasing the state-of-the-art by approximately 0.8% on the widely-used Adience [28] dataset for Age and Gender classification and by nearly 3% on the recent AffectNet [36] dataset for Facial Expression classification. We hope our simple, reproducible and effective approach will help ease future research in facial soft-biometric classification using saliency.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Ali Borji,et al.  Boosting bottom-up and top-down visual features for saliency estimation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Buket D. Barkana,et al.  Deep Convolutional Neural Network for Age Estimation based on VGG-Face Model , 2017, ArXiv.

[4]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[5]  Junmo Kim,et al.  Joint Fine-Tuning in Deep Neural Networks for Facial Expression Recognition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[6]  Tal Hassner,et al.  Effective face frontalization in unconstrained images , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Demis Hassabis,et al.  Mastering the game of Go with deep neural networks and tree search , 2016, Nature.

[8]  Huchuan Lu,et al.  Adaptive Metric Learning for Saliency Detection , 2015, IEEE Transactions on Image Processing.

[9]  Hatice Gunes,et al.  CNN-based Facial Affect Analysis on Mobile Devices , 2018, ArXiv.

[10]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Tal Hassner,et al.  Age and Gender Estimation of Unfiltered Faces , 2014, IEEE Transactions on Information Forensics and Security.

[12]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Nanning Zheng,et al.  Learning to Detect A Salient Object , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Yizhou Yu,et al.  Visual saliency based on multiscale deep features , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Carlos D. Castillo,et al.  An All-In-One Convolutional Neural Network for Face Analysis , 2016, 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017).

[16]  Mohammad H. Mahoor,et al.  AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild , 2017, IEEE Transactions on Affective Computing.

[17]  Shiguang Shan,et al.  Adaptive Partial Differential Equation Learning for Visual Saliency Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Andreas Jönsson,et al.  Recognizing Spontaneous Facial Expressions using Deep Convolutional Neural Networks , 2018 .

[19]  Tal Hassner,et al.  Age and gender classification using convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[20]  Vibhav Vineet,et al.  Efficient Salient Region Detection with Soft Image Abstraction , 2013, 2013 IEEE International Conference on Computer Vision.

[21]  Alexander Binder,et al.  Understanding and Comparing Deep Neural Networks for Age and Gender Classification , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[22]  Terrence J. Sejnowski,et al.  SEXNET: A Neural Network Identifies Sex From Human Faces , 1990, NIPS.

[23]  S. Süsstrunk,et al.  Frequency-tuned salient region detection , 2009, CVPR 2009.

[24]  Rita Cucchiara,et al.  A deep multi-level network for saliency prediction , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[25]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[26]  Luc Van Gool,et al.  Face Detection without Bells and Whistles , 2014, ECCV.

[27]  Sergio Escalera,et al.  Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History, Trends, and Affect-Related Applications , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[29]  Maja Pantic,et al.  Local Deep Neural Networks for Age and Gender Classification , 2017, ArXiv.

[30]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Michael J. Lyons,et al.  Coding facial expressions with Gabor wavelets , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[32]  Jian Sun,et al.  Saliency Optimization from Robust Background Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[34]  Richard Socher,et al.  Ask Me Anything: Dynamic Memory Networks for Natural Language Processing , 2015, ICML.

[35]  Qi Zhao,et al.  SALICON: Saliency in Context , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Afshin Dehghan,et al.  DAGER: Deep Age, Gender and Emotion Recognition Using Convolutional Neural Network , 2017, ArXiv.

[37]  Qiang Chen,et al.  Network In Network , 2013, ICLR.

[38]  Azar Fazel,et al.  Convolutional Neural Networks for Facial Expression Recognition , 2017, ArXiv.

[39]  Aren Jansen,et al.  CNN architectures for large-scale audio classification , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[40]  Luc Van Gool,et al.  DEX: Deep EXpectation of Apparent Age from a Single Image , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).