Residual vs. Inception vs. Classical Networks for Low-Resolution Face Recognition

Low-resolution face recognition remains a challenging task when analyzing surveillance footage. While Convolutional Neural Network (CNN) approaches have brought impressive improvements to high-resolution face recognition, their benefit to low-resolution face recognition remains unclear, as little work has been done in this area. This paper adapts three popular high-resolution CNN designs to the low-resolution (LR) domain to find the most suitable architecture: the classical AlexNet/VGG architecture, Google’s inception architecture, and Microsoft’s residual architecture. While the inception and residual concepts have proven useful for very deep networks, we show that in the LR case networks shallower than those used for high-resolution recognition are sufficient, which gives the classical network architecture an advantage. Final evaluation on a downscaled version of the public YouTube Faces Database indicates performance comparable to that in the high-resolution domain. Results on faces extracted from the SoBiS surveillance dataset indicate superior performance of the trained networks in the LR domain.
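To illustrate what evaluating on a downscaled dataset involves, the following minimal sketch reduces a grayscale image by averaging non-overlapping pixel blocks. It is an assumption-laden illustration only: the abstract does not state the downscaling method or target resolution, and the function name `downscale_by_averaging`, the factor of 2, and the tiny 4x4 input are all hypothetical.

```python
def downscale_by_averaging(image, factor):
    """Downscale a 2-D grayscale image (list of rows) by averaging
    non-overlapping factor x factor blocks. Hypothetical helper, not the
    paper's actual preprocessing."""
    height, width = len(image), len(image[0])
    if factor < 1 or height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Collect all pixels of the current block and average them.
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Hypothetical 4x4 stand-in for a face crop; real crops are much larger.
high_res = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [8, 8, 0, 0],
    [8, 8, 0, 0],
]
low_res = downscale_by_averaging(high_res, 2)
print(low_res)  # [[2.0, 6.0], [8.0, 0.0]]
```

Block averaging is only one of several possible resampling choices; bicubic or bilinear interpolation from an image library would be equally plausible ways to produce LR inputs.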
