Analyzing ConvNets Depth for Deep Face Recognition

Deep convolutional neural networks are becoming increasingly popular in large-scale image recognition, classification, localization, and detection. In this paper, the performance of state-of-the-art convolution neural networks (ConvNets) models of the ImageNet challenge (ILSVRC), namely VGG16, VGG19, OverFeat, ResNet50, and Inception-v3 which achieved top-5 error rates up to 4.2% are analyzed in the context of face recognition. Instead of using handcrafted feature extraction techniques which requires a domain-level understanding, ConvNets have the advantages of automatically learning complex features, more training time, and less evaluation time. These models are benchmarked on AR and Extended Yale B face dataset with five performance metrics, namely Precision, Recall, F1-score, Rank-1 accuracy, and Rank-5 accuracy. It is found that GoogleNet ConvNets model with Inception-v3 architecture outperforms than other four architectures with a Rank-1 accuracy of 98.46% on AR face dataset and 97.94% accuracy on Extended Yale B face dataset. It confirms that deep CNN architectures are suitable for real-time face recognition in the future.

[1]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[2]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Jian Sun,et al.  Face recognition with learning-based descriptor , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4]  Aleix M. Martinez,et al.  The AR face database , 1998 .

[5]  Peter N. Belhumeur,et al.  Tom-vs-Pete Classifiers and Identity-Preserving Alignment for Face Verification , 2012, BMVC.

[6]  Chong-Wah Ngo,et al.  Evaluating bag-of-visual-words representations in scene classification , 2007, MIR '07.

[7]  Umar Mohammed,et al.  Probabilistic Models for Inference about Identity , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Matti Pietikäinen,et al.  Multiresolution Gray-Scale and Rotation Invariant Texture Classification with Local Binary Patterns , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  Rich Caruana,et al.  Do Deep Nets Really Need to be Deep? , 2013, NIPS.

[10]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[11]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[13]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[14]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.