Large-Scale Image Annotation by Efficient and Robust Kernel Metric Learning

One of the key challenges in search-based image annotation models is to define an appropriate similarity measure between images. Many kernel distance metric learning (KML) algorithms have been developed to capture the nonlinear relationships between the visual features and the semantics of images. One fundamental limitation in applying KML to image annotation is that it requires converting image annotations into binary constraints, leading to significant information loss. In addition, most KML algorithms suffer from high computational cost because the learned matrix is required to be positive semi-definite (PSD). In this paper, we propose a robust kernel metric learning (RKML) algorithm, based on regression, that is able to utilize image annotations directly. The proposed method is also computationally more efficient because the PSD property is automatically ensured by regression. We provide a theoretical guarantee for the proposed algorithm, and verify its efficiency and effectiveness for image annotation by comparing it to state-of-the-art approaches for both distance metric learning and image annotation.
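
To illustrate why a regression-based formulation can yield a PSD matrix without an explicit projection step, the following is a minimal sketch and not the RKML algorithm itself. It assumes an RBF kernel, a plain ridge-regularized regression from the kernel matrix onto a tag matrix, and toy random data; the names rbf_kernel, fit_regression_metric, kernel_distance, gamma and lam are all hypothetical. The learned matrix has the form M = A A^T, which is PSD by construction.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Pairwise RBF kernel values between the rows of A and the rows of B.
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_regression_metric(X, Y, gamma=0.5, lam=1e-2):
    # Assumed formulation: ridge regression from the kernel matrix K onto
    # the tag matrix Y, i.e. A = (K + lam * I)^{-1} Y.
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(K.shape[0]), Y)

def kernel_distance(A, X_train, x1, x2, gamma=0.5):
    # Distance (k1 - k2)^T A A^T (k1 - k2), where k1 and k2 are the kernel
    # vectors of x1 and x2 against the training images.
    k1 = rbf_kernel(x1[None, :], X_train, gamma)[0]
    k2 = rbf_kernel(x2[None, :], X_train, gamma)[0]
    diff = A.T @ (k1 - k2)
    return float(diff @ diff)

# Toy data (assumption): 20 images, 5 visual features, 4 candidate tags.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y = (rng.random((20, 4)) < 0.3).astype(float)

A = fit_regression_metric(X, Y)
M = A @ A.T                           # PSD by construction, no projection
min_eig = np.linalg.eigvalsh(M).min() # non-negative up to rounding error
d = kernel_distance(A, X, X[0], X[1])
```

Because the tag matrix Y enters the regression directly, the sketch never converts annotations into pairwise must-link or cannot-link constraints, which is the information loss the abstract refers to.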
