Fisher Vector Faces in the Wild

Several recent papers on automatic face verification have significantly raised the performance bar by developing novel, specialised representations that outperform standard features such as SIFT for this problem. This paper makes two contributions. First, and somewhat surprisingly, we show that Fisher vectors on densely sampled SIFT features, i.e. an off-the-shelf object recognition representation, achieve state-of-the-art face verification performance on the challenging “Labeled Faces in the Wild” benchmark. Second, since Fisher vectors are very high-dimensional, we show that a compact descriptor can be learnt from them using discriminative metric learning. This compact descriptor has better recognition accuracy and is well suited to large-scale identification tasks.
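To make the second contribution concrete, the following is a minimal sketch of how a compact descriptor could be learnt from high-dimensional vectors by discriminative metric learning. It is an illustration under stated assumptions, not necessarily the formulation used in the paper: the low-rank projection `W`, the hinge loss with unit margin, the decision threshold and all function names are hypothetical choices made for this example.

```python
import numpy as np

def sq_distance(W, x, y):
    """Squared Euclidean distance between x and y after projection by W."""
    diff = W @ (x - y)
    return float(diff @ diff)

def same_person(W, x, y, threshold):
    """Verification decision: a pair matches if its projected distance is small."""
    return sq_distance(W, x, y) < threshold

def sgd_step(W, x, y, same, threshold, lr):
    """One subgradient step on the (assumed) hinge loss
    max(0, 1 - s * (threshold - d_W(x, y))), with s = +1 for a
    same-identity pair and s = -1 for a different-identity pair."""
    s = 1.0 if same else -1.0
    delta = x - y
    if s * (threshold - sq_distance(W, x, y)) > 1.0:
        return W  # margin already satisfied, no update
    # d_W = ||W delta||^2, so its gradient w.r.t. W is 2 (W delta) delta^T
    grad = 2.0 * s * np.outer(W @ delta, delta)
    return W - lr * grad

# Toy usage: random 16-D vectors stand in for Fisher vectors,
# and W maps them to a 4-D compact descriptor.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((4, 16))
x, y = rng.standard_normal(16), rng.standard_normal(16)
W_new = sgd_step(W, x, y, same=True, threshold=0.0, lr=1e-3)
compact_x = W_new @ x  # the compact descriptor stored for x
```

Because `W` has far fewer rows than columns, only the short projected vector needs to be stored per face, which is what would make such a descriptor attractive for large-scale identification.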
