Cooperative Sparse Representation in Two Opposite Directions for Semi-Supervised Image Annotation

Recent studies have shown that sparse representation (SR) can deal well with many computer vision problems, and its kernel version has powerful classification capability. In this paper, we address the application of a cooperative SR in semi-supervised image annotation which can increase the amount of labeled images for further use in training image classifiers. Given a set of labeled (training) images and a set of unlabeled (test) images, the usual SR method, which we call forward SR, is used to represent each unlabeled image with several labeled ones, and then to annotate the unlabeled image according to the annotations of these labeled ones. However, to the best of our knowledge, the SR method in an opposite direction, that we call backward SR to represent each labeled image with several unlabeled images and then to annotate any unlabeled image according to the annotations of the labeled images which the unlabeled image is selected by the backward SR to represent, has not been addressed so far. In this paper, we explore how much the backward SR can contribute to image annotation, and be complementary to the forward SR. The co-training, which has been proved to be a semi-supervised method improving each other only if two classifiers are relatively independent, is then adopted to testify this complementary nature between two SRs in opposite directions. Finally, the co-training of two SRs in kernel space builds a cooperative kernel sparse representation (Co-KSR) method for image annotation. Experimental results and analyses show that two KSRs in opposite directions are complementary, and Co-KSR improves considerably over either of them with an image annotation performance better than other state-of-the-art semi-supervised classifiers such as transductive support vector machine, local and global consistency, and Gaussian fields and harmonic functions. Comparative experiments with a nonsparse solution are also performed to show that the sparsity plays an important role in the cooperation of image representations in two opposite directions. This paper extends the application of SR in image annotation and retrieval.

[1]  Cordelia Schmid,et al.  Multimodal semi-supervised learning for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Ingrid Daubechies,et al.  Ten Lectures on Wavelets , 1992 .

[3]  Liang-Tien Chia,et al.  Local features are not lonely – Laplacian sparse coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[4]  Hervé Glotin,et al.  Diversifying Image Retrieval with Affinity-Propagation Clustering on Visual Manifolds , 2009, IEEE MultiMedia.

[5]  James Ze Wang,et al.  Real-Time Computerized Annotation of Pictures , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Meng Wang,et al.  Beyond Distance Measurement: Constructing Neighborhood Similarity for Video Annotation , 2009, IEEE Transactions on Multimedia.

[7]  Jing Liu,et al.  Image annotation via graph learning , 2009, Pattern Recognit..

[8]  Wei Jia,et al.  Discriminant sparse neighborhood preserving embedding for face recognition , 2012, Pattern Recognit..

[9]  Cordelia Schmid,et al.  Semi-Local Affine Parts for Object Recognition , 2004, BMVC.

[10]  Michael Elad,et al.  Sparse Representation for Color Image Restoration , 2008, IEEE Transactions on Image Processing.

[11]  M. Salman Asif,et al.  Dynamic Updating for ` 1 Minimization , 2009 .

[12]  Justin K. Romberg,et al.  Dynamic Updating for $\ell_{1}$ Minimization , 2009, IEEE Journal of Selected Topics in Signal Processing.

[13]  Markus A. Stricker,et al.  Similarity of color images , 1995, Electronic Imaging.

[14]  권홍우,et al.  Bootstrapping , 2002, ACL.

[15]  Ahmed A. Alani Principal component analysis in statistics , 2014 .

[16]  Hossein Mobahi,et al.  Toward a Practical Face Recognition System: Robust Alignment and Illumination by Sparse Representation , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Yang Yu,et al.  Automatic image annotation using group sparsity , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[18]  Bernhard Schölkopf,et al.  Kernel Principal Component Analysis , 1997, ICANN.

[19]  Yaakov Tsaig,et al.  Fast Solution of $\ell _{1}$ -Norm Minimization Problems When the Solution May Be Sparse , 2008, IEEE Transactions on Information Theory.

[20]  Liang-Tien Chia,et al.  Kernel Sparse Representation for Image Classification and Face Recognition , 2010, ECCV.

[21]  Cordelia Schmid,et al.  A sparse texture representation using local affine regions , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Meng Wang,et al.  Unified Video Annotation via Multigraph Learning , 2009, IEEE Transactions on Circuits and Systems for Video Technology.

[24]  Shuicheng Yan,et al.  Multi-label sparse coding for automatic image annotation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[26]  Vincent Lepetit,et al.  Are sparse representations really relevant for image classification? , 2011, CVPR 2011.

[27]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[28]  Guillermo Sapiro,et al.  Non-local sparse models for image restoration , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[29]  Cordelia Schmid,et al.  A sparse texture representation using affine-invariant regions , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[30]  Allen Y. Yang,et al.  Robust Face Recognition via Sparse Representation , 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Jing Peng,et al.  Composite kernels for semi-supervised clustering , 2011, Knowledge and Information Systems.

[32]  Christopher K. I. Williams,et al.  Pascal Visual Object Classes Challenge Results , 2005 .

[33]  Douglas A. Reynolds,et al.  The NIST speaker recognition evaluation - Overview, methodology, systems, results, perspective , 2000, Speech Commun..

[34]  Michael A. Saunders,et al.  Atomic Decomposition by Basis Pursuit , 1998, SIAM J. Sci. Comput..

[35]  Jingrui He,et al.  Generalized Manifold-Ranking-Based Image Retrieval , 2006, IEEE Transactions on Image Processing.

[36]  B. S. Manjunath,et al.  An efficient color representation for image retrieval , 2001, IEEE Trans. Image Process..

[37]  Ramin Zabih,et al.  Comparing images using color coherence vectors , 1997, MULTIMEDIA '96.

[38]  Thomas Deselaers,et al.  The Visual Concept Detection Task in ImageCLEF 2008 , 2008, CLEF.

[39]  Yao Zhao,et al.  Co-training for search-based automatic image annotation , 2008, J. Digit. Inf. Manag..

[40]  Avrim Blum,et al.  The Bottleneck , 2021, Monopsony Capitalism.

[41]  Shuicheng Yan,et al.  Visual classification with multi-task joint sparse representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[42]  James Ze Wang,et al.  Image retrieval: Ideas, influences, and trends of the new age , 2008, CSUR.

[43]  Michael J. Swain,et al.  Color indexing , 1991, International Journal of Computer Vision.

[44]  Thorsten Joachims,et al.  Transductive Inference for Text Classification using Support Vector Machines , 1999, ICML.

[45]  Mandava Rajeswari,et al.  A survey of the state of the art in learning the kernels , 2012, Knowledge and Information Systems.

[46]  James E. Fowler,et al.  Compressive-Projection Principal Component Analysis , 2009, IEEE Transactions on Image Processing.

[47]  Yang Yan,et al.  Semi-supervised fuzzy co-clustering algorithm for document categorization , 2011, Knowledge and Information Systems.

[48]  Tobias Scheffer,et al.  Multi-Relational Learning, Text Mining, and Semi-Supervised Learning for Functional Genomics , 2004, Machine Learning.

[49]  Zoubin Ghahramani,et al.  Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions , 2003, ICML 2003.

[50]  Yao Zhao,et al.  TSVM-HMM: Transductive SVM based hidden Markov model for automatic image annotation , 2009, Expert Syst. Appl..

[51]  Michael Elad,et al.  Sparse and Redundant Representations - From Theory to Applications in Signal and Image Processing , 2010 .

[52]  Bernhard Schölkopf,et al.  Learning with Local and Global Consistency , 2003, NIPS.