Unlabeled data improves word prediction

Labeling image collections is tedious, especially when several labels must be chosen for each image. In this paper we introduce a framework that extends state-of-the-art word prediction models to incorporate information from unlabeled examples using manifold regularization. To the best of our knowledge, this is the first semi-supervised multi-task model used in vision problems. The model can be trained with gradient descent and is fast and efficient. We show substantial improvements when few labeled examples are available, on challenging multi-task learning problems in vision: predicting words for images and attributes for objects.
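The general idea of manifold regularization trained by gradient descent can be illustrated with a minimal sketch. This is not the paper's model. It assumes a linear predictor, a squared loss, and a k-nearest-neighbor graph Laplacian built over all labeled and unlabeled points. The names `knn_laplacian`, `objective`, `fit`, `gamma_a`, and `gamma_i` are hypothetical and chosen only for this illustration. Each column of `W` predicts one word, so all words are trained jointly, and the Laplacian term penalizes predictions that differ between neighboring images whether or not they are labeled.

```python
import numpy as np

def knn_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - A of a symmetrized k-NN graph (assumed graph choice)."""
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # pairwise squared distances
    np.fill_diagonal(sq, np.inf)                             # a point is not its own neighbor
    nearest = np.argsort(sq, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    A = np.maximum(A, A.T)                                   # make the graph undirected
    return np.diag(A.sum(axis=1)) - A

def objective(W, X, Y, labeled, L, gamma_a, gamma_i):
    """Squared loss on labeled rows + ridge penalty + Laplacian smoothness on all rows."""
    n = X.shape[0]
    F = X @ W                                  # predictions for every point and every word
    fit_term = ((F[labeled] - Y) ** 2).sum() / len(labeled)
    ridge = gamma_a * (W ** 2).sum()
    smooth = gamma_i / n**2 * np.trace(F.T @ L @ F)
    return fit_term + ridge + smooth

def fit(X, Y, labeled, L, gamma_a=1e-2, gamma_i=1.0, steps=500):
    """Plain gradient descent on the objective above (illustrative, not the paper's solver)."""
    n, d = X.shape
    Xl = X[labeled]
    M = X.T @ L @ X
    # The objective is quadratic in W; H is its Hessian with respect to each column.
    H = 2 * Xl.T @ Xl / len(labeled) + 2 * gamma_a * np.eye(d) + 2 * gamma_i / n**2 * M
    lr = 1.0 / np.linalg.eigvalsh(H).max()     # step size that guarantees descent
    W = np.zeros((d, Y.shape[1]))
    for _ in range(steps):
        grad = H @ W - 2 * Xl.T @ Y / len(labeled)
        W -= lr * grad
    return W
```

Here `X` holds one feature row per image, `labeled` indexes the rows that have annotations, and `Y` is the binary word matrix for those rows only. Setting `gamma_i=0` recovers an ordinary supervised ridge fit, which makes it easy to compare against the semi-supervised variant on the same data.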
