Hierarchical Multi-Task Network For Race, Gender and Facial Attractiveness Recognition

Deep learning has powered many face-related tasks and achieved state-of-the-art performance. However, existing deep models are often trained separately for each problem, which results in a heavy computational burden. To address this problem, we propose a novel multi-task network with a fully convolutional architecture, the Hierarchical Multi-task Network (HMTNet), which simultaneously recognizes a person’s gender, race and facial attractiveness from a given portrait image. To improve robustness to outliers in the facial beauty prediction task, a novel loss is introduced into HMTNet. Compared to existing deep approaches, the proposed HMTNet achieves state-of-the-art performance on several datasets, and it learns more discriminative feature representations through joint training and feature aggregation. Extensive experiments demonstrate the effectiveness of HMTNet.
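
As a rough illustration of joint training over the three tasks, the sketch below combines two classification losses (gender, race) with an outlier-robust regression loss (attractiveness score) into a single objective. The abstract does not specify the form of the proposed loss, so the Huber loss here is only a familiar stand-in and should not be read as the loss used in HMTNet. The function names, the `delta` threshold, and the task `weights` are likewise hypothetical.

```python
import math

def softmax_cross_entropy(logits, target):
    """Cross-entropy of a softmax over `logits` against class index `target`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def huber(pred, target, delta=1.0):
    """Huber loss: quadratic for small errors, linear for large ones.

    Stand-in for an outlier-robust regression loss (assumption, not the
    loss proposed in the paper).
    """
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err * err
    return delta * (err - 0.5 * delta)

def joint_loss(gender_logits, gender, race_logits, race,
               score_pred, score, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three per-task losses (weights are hypothetical)."""
    w_gender, w_race, w_score = weights
    return (w_gender * softmax_cross_entropy(gender_logits, gender)
            + w_race * softmax_cross_entropy(race_logits, race)
            + w_score * huber(score_pred, score))

# One sample: uninformative logits for both classifiers and a large score error.
loss = joint_loss([0.0, 0.0], 0, [0.0, 0.0], 1, score_pred=3.0, score=0.0)
```

With a shared feature extractor, minimizing a sum of this kind lets the gradients from all three tasks update the same backbone, which is the general idea behind joint training. The linear tail of the Huber term limits the influence of samples whose attractiveness label is far from the prediction.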
