Learning to classify gender from four million images

We automatically assemble a big dataset to train a face gender classifier.This is formed by 4 million images and over 60,000 features.The resulting system significantly outperforms the previous state of the art without human annotation.This study lends support to the "unreasonable effectiveness of data" conjecture.This study is relevant to computer vision (LBP features, face classification), machine learning (large scale linear classifiers), and big data.This study can serve as a template for other "web scale" learning tasks. The application of learning algorithms to big datasets has been identified for a long time as an effective way to attack important tasks in pattern recognition, but the generation of large annotated datasets has a significant cost. We present a simple and effective method to generate a classifier of face images, by training a linear classification algorithm on a massive dataset entirely assembled and labelled by automated means. In doing so, we perform the largest experiment on face gender recognition so far published, reporting the highest performance yet. Four million images and more than 60,000 features are used to train online classifiers. By using an ensemble of linear classifiers, we achieve an accuracy of 96.86% on the most challenging public database, labelled faces in the wild (LFW), 2.05% higher than the previous best result on the same dataset (Shan, 2012). This result is relevant both for the machine learning community, addressing the role of large datasets, and the computer vision community, providing a way to make high quality face gender classifiers. Furthermore, we propose a general way to generate and exploit massive data without human annotation. Finally, we demonstrate a simple and effective adaptation of the Pegasos that makes it more robust.

[1]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  José Miguel Buenaposada,et al.  Revisiting Linear Discriminant Techniques in Gender Recognition , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Peter Norvig,et al.  The Unreasonable Effectiveness of Data , 2009, IEEE Intelligent Systems.

[5]  Ming-Hsuan Yang,et al.  Learning Gender with Support Faces , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Shumeet Baluja,et al.  Boosting Sex Identification Performance , 2005, International Journal of Computer Vision.

[7]  Nello Cristianini,et al.  An Introduction to Support Vector Machines and Other Kernel-based Learning Methods , 2000 .

[8]  Paul A. Viola,et al.  Robust Real-Time Face Detection , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[9]  Paul A. Viola,et al.  A unified learning framework for real time face detection and classification , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[10]  Andrew Zisserman,et al.  Hello! My name is... Buffy'' -- Automatic Naming of Characters in TV Video , 2006, BMVC.

[11]  Roope Raisamo,et al.  Evaluation of Gender Classification Methods with Automatically Detected and Aligned Faces , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Haizhou Ai,et al.  Real-time gender classification , 2003, International Symposium on Multispectral Image Processing and Pattern Recognition.

[13]  Karl Ricanek,et al.  MORPH: a longitudinal image database of normal adult age-progression , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[14]  Chi-Ho Chan Multi-scale local Binary Pattern Histogram for Face Recognition , 2007, ICB.

[15]  Harry Wechsler,et al.  The FERET database and evaluation procedure for face-recognition algorithms , 1998, Image Vis. Comput..

[16]  Yoram Singer,et al.  Pegasos: primal estimated sub-gradient solver for SVM , 2007, ICML '07.

[17]  Ming Li,et al.  An Experimental Study on Automatic Face Gender Classification , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[18]  Javier Lorenzo-Navarro,et al.  Gender Classification in Large Databases , 2012, CIARP.

[19]  Daniel González-Jiménez,et al.  Single- and cross- database benchmarks for gender classification under unconstrained settings , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).

[20]  Trevor Darrell,et al.  PANDA: Pose Aligned Networks for Deep Attribute Modeling , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Tsuhan Chen,et al.  Understanding images of groups of people , 2009, CVPR.

[22]  Caifeng Shan,et al.  Learning local binary patterns for gender classification on real-world face images , 2012, Pattern Recognit. Lett..

[23]  Jian Sun,et al.  Blessing of Dimensionality: High-Dimensional Feature and Its Efficient Compression for Face Verification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Marwan Mattar,et al.  Labeled Faces in the Wild: A Database forStudying Face Recognition in Unconstrained Environments , 2008 .

[25]  Tal Hassner,et al.  Similarity Scores Based on Background Samples , 2009, ACCV.