Benchmarking parts based face processing in-the-wild for gender recognition and head pose estimation

Abstract: Many in-the-wild face processing approaches suffer degraded performance due to, among other factors, facial expressions, occlusions, accessories, and variations in lighting and head pose. To better understand these issues, four face parts (each eye, the nose, and the mouth) are studied for performing face processing tasks in challenging environments. Additionally, an automatic pipeline based on convolutional neural networks is proposed that detects the available regions, processes each one, and combines the per-region results into a robust solution. The pipeline is evaluated on two common face processing tasks: head pose estimation and gender recognition. Experiments use two different object detectors; six convolutional network architectures for the classification step, five popular and one custom; and two datasets, one per task, that differ in overall difficulty and cover a wide range of the unconstrained-scenario spectrum. Results are detailed for each region and for their combination, compared against the state of the art, and discussed in depth. In particular, the experiments indicate that the nose and the mouth play a major role in challenging scenarios because of their robustness to self-occlusion. The complete pipeline outperforms state-of-the-art works when estimating head pose in unconstrained scenarios and achieves competitive performance for gender recognition. Because each region is evaluated separately, degraded parts can be excluded from processing, which favors reliable face information and increases performance.
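The idea of combining per-region results while excluding degraded parts can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so everything here is an assumption made for illustration: the function name `fuse_part_predictions`, the detection-score threshold `min_score`, and the choice of averaging class probabilities over the retained parts are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical per-part output: (detection score, class probabilities),
# or None when the detector did not find that part.
PartOutput = Optional[Tuple[float, Sequence[float]]]


def fuse_part_predictions(
    part_outputs: Dict[str, PartOutput],
    min_score: float = 0.5,  # assumed threshold, not from the paper
) -> Optional[List[float]]:
    """Average class probabilities over the parts considered reliable.

    Parts that were not detected, or whose detection score falls below
    `min_score`, are skipped. Returns None if no part is usable.
    """
    kept: List[Sequence[float]] = []
    for output in part_outputs.values():
        if output is None:
            continue  # part not detected (e.g. self-occluded)
        score, probs = output
        if score >= min_score:
            kept.append(probs)

    if not kept:
        return None

    num_classes = len(kept[0])
    return [sum(p[i] for p in kept) / len(kept) for i in range(num_classes)]


# Example with two classes: the left eye is missing and the mouth
# detection is weak, so only the right eye and the nose contribute.
outputs = {
    "left_eye": None,
    "right_eye": (0.9, [0.6, 0.4]),
    "nose": (0.8, [0.8, 0.2]),
    "mouth": (0.3, [0.1, 0.9]),
}
fused = fuse_part_predictions(outputs)
```

Averaging is only one possible combination rule; a weighted vote or a learned fusion layer would fit the same interface, and a regression task such as head pose estimation would average angles rather than class probabilities.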
