Retrieval-Based Face Annotation by Weak Label Regularized Local Coordinate Coding

Auto face annotation, which aims to detect human faces in an image and assign them the correct names, is a fundamental research problem that benefits many real-world applications. In this work, we address this problem by investigating a retrieval-based annotation scheme that mines the massive collections of web facial images freely available on the Internet. In particular, given a facial image, we first retrieve the top n most similar instances from a large-scale web facial image database using content-based image retrieval techniques, and then use their labels for auto annotation. Such a scheme faces two major challenges: 1) how to retrieve similar facial images that truly match the query, and 2) how to exploit the noisy labels of the top similar facial images, which may be incorrect or incomplete due to the nature of web images. In this paper, we propose an effective Weak Label Regularized Local Coordinate Coding (WLRLCC) technique, which exploits the principle of local coordinate coding by learning sparse features, and employs graph-based weak label regularization to enhance the weak labels of the similar facial images. We propose an efficient optimization algorithm to solve the WLRLCC problem and develop an effective sparse reconstruction scheme to perform the face annotation task. We conduct extensive empirical studies on several web facial image databases to evaluate the proposed WLRLCC algorithm from different aspects, and the experimental results validate its efficacy. We share the two constructed databases, "WDB" (714,454 images of 6,025 people) and "ADB" (126,070 images of 1,200 people), with the public. To further improve efficiency and scalability, we also propose an offline approximation scheme (AWLRLCC), which generally maintains comparable results while significantly reducing the annotation time.
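
The two-step retrieval-based scheme described above (retrieve the top n similar faces, then reuse their labels) can be illustrated with a minimal sketch. This is not the WLRLCC algorithm: it swaps in a plain distance-weighted vote over the retrieved labels as a simple baseline. The function names, the Euclidean distance, the weighting formula, and the toy two-dimensional features are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_top_n(query, database, n):
    """Return the n database entries whose features are closest to the query.

    Each entry is a (features, names) pair; names are weak (possibly noisy) labels.
    """
    ranked = sorted(database, key=lambda item: euclidean(query, item[0]))
    return ranked[:n]


def annotate(query, database, n=3):
    """Rank candidate names by a distance-weighted vote over the top-n neighbours.

    Assumption: a simple voting baseline, not the paper's sparse reconstruction.
    """
    scores = defaultdict(float)
    for features, names in retrieve_top_n(query, database, n):
        weight = 1.0 / (1.0 + euclidean(query, features))  # closer faces count more
        for name in names:
            scores[name] += weight
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy database: 2-D features stand in for real face descriptors.
database = [
    ((0.0, 0.0), ["alice"]),
    ((0.1, 0.0), ["alice"]),
    ((0.0, 0.2), ["bob"]),   # a noisy weak label sitting near the "alice" cluster
    ((5.0, 5.0), ["bob"]),
    ((5.1, 4.9), ["bob"]),
]

ranked_names = annotate((0.05, 0.0), database, n=3)
print(ranked_names)
```

In this toy setup the single mislabeled neighbour is outvoted by the two closer, correctly labeled ones. A plain vote like this breaks down when noisy or missing labels dominate the retrieved set, which is the situation the weak label regularization in WLRLCC is meant to address.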
