DEX: Deep EXpectation of Apparent Age from a Single Image

In this paper we tackle the estimation of apparent age in still face images with deep learning. Our convolutional neural networks (CNNs) use the VGG-16 architecture [13] and are pretrained on ImageNet for image classification. In addition, due to the limited number of apparent age annotated images, we explore the benefit of finetuning over crawled Internet face images with available age. We crawled 0.5 million images of celebrities from IMDB and Wikipedia that we make public. This is the largest public dataset for age prediction to date. We pose the age regression problem as a deep classification problem followed by a softmax expected value refinement and show improvements over direct regression training of CNNs. Our proposed method, Deep EXpectation (DEX) of apparent age, first detects the face in the test image and then extracts the CNN predictions from an ensemble of 20 networks on the cropped face. The CNNs of DEX were finetuned on the crawled images and then on the provided images with apparent age annotations. DEX does not use explicit facial landmarks. Our DEX is the winner (1st place) of the ChaLearn LAP 2015 challenge on apparent age estimation with 115 registered teams, significantly outperforming the human reference.

[1]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Karl Ricanek,et al.  MORPH: a longitudinal image database of normal adult age-progression , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[3]  Tal Hassner,et al.  Age and Gender Estimation of Unfiltered Faces , 2014, IEEE Transactions on Information Forensics and Security.

[4]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[5]  Luc Van Gool,et al.  Face Detection without Bells and Whistles , 2014, ECCV.

[6]  Chu-Song Chen,et al.  Cross-Age Reference Coding for Age-Invariant Face Recognition and Retrieval , 2014, ECCV.

[7]  Jürgen Schmidhuber,et al.  Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Anil K. Jain,et al.  Age estimation from face images: Human vs. machine performance , 2013, 2013 International Conference on Biometrics (ICB).

[9]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[10]  Sergio Escalera,et al.  ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition Datasets and Results , 2015, 2015 IEEE International Conference on Computer Vision Workshop (ICCVW).

[11]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[12]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.