Transductive Kernel Map Learning and Its Application to Image Annotation

In this paper, we introduce a novel image annotation approach based on maximum margin classification and a new class of kernels. The method goes beyond the naive use of existing kernels and their restricted combinations in order to design "model-free" transductive kernels applicable to interconnected image databases. The main contribution of our method is the minimization of an energy function that combines (i) a reconstruction term that factorizes a matrix of interconnected image data as the product of a learned dictionary and a learned kernel map, (ii) a fidelity term that keeps label predictions consistent with those provided in a training set, and (iii) a smoothness term that guarantees similar labels for neighboring data and allows us to iteratively diffuse kernel maps and labels from labeled to unlabeled images. Solving this minimization problem makes it possible to learn both a decision criterion and a kernel map that guarantee linear separability in a high-dimensional space and good generalization performance. Experiments on image annotation show that the learned kernel achieves results at least comparable to those of related state-of-the-art methods on the MSRC and Corel5k databases.
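
The three-term energy described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's exact objective: the squared-error form of each term, the graph-Laplacian smoothness penalty, the linear predictor `w @ Phi`, and all names (`energy`, `X`, `B`, `Phi`, `w`, `y`, `labeled`, `A`, `alpha`, `beta`) are hypothetical choices made for the example.

```python
import numpy as np

def energy(X, B, Phi, w, y, labeled, A, alpha=1.0, beta=1.0):
    """Illustrative three-term energy (assumed form, not the paper's objective).

    X: (d, n) data matrix, B: (d, p) dictionary, Phi: (p, n) kernel map,
    w: (p,) linear decision weights, y: (n,) labels (used only where labeled),
    labeled: (n,) boolean mask, A: (n, n) symmetric adjacency between images.
    """
    # Reconstruction: X should be close to the product B @ Phi.
    recon = np.linalg.norm(X - B @ Phi, "fro") ** 2
    # Linear predictions in the space of the kernel map.
    f = w @ Phi
    # Fidelity: predictions should match the known labels on labeled images.
    fidelity = np.sum((f[labeled] - y[labeled]) ** 2)
    # Smoothness: f^T L f penalizes label differences between neighbors.
    L = np.diag(A.sum(axis=1)) - A
    smooth = f @ L @ f
    return recon + alpha * fidelity + beta * smooth

# Tiny example: three images, the first and third labeled.
Phi = np.array([[1.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
B = np.eye(2)
X = B @ Phi                      # exact factorization, zero reconstruction error
w = np.array([1.0, -1.0])        # predictions f = [1, 1, -1]
y = np.array([1.0, 0.0, -1.0])
labeled = np.array([True, False, True])
A_same = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])   # links images 0 and 1 (same prediction)
A_diff = np.array([[0.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0]])   # links images 1 and 2 (opposite predictions)
e_same = energy(X, B, Phi, w, y, labeled, A_same)
e_diff = energy(X, B, Phi, w, y, labeled, A_diff)
```

In this toy setting the energy vanishes when linked images share a prediction and grows when a link joins images with different predictions, which is the behavior the smoothness term is meant to enforce. In a transductive scheme one would minimize such an energy jointly over the dictionary, the kernel map, and the decision weights; the alternating or iterative solver is not shown here.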
