Graph-based dimensionality reduction for KNN-based image annotation

KNN-based image annotation has proved very successful. However, it suffers from two problems: (1) high computational cost and (2) difficulty in finding semantically similar images. In this paper, we propose a graph-based dimensionality reduction method that addresses both problems by adapting locality sensitive discriminant analysis [1] to the multi-label setting. We first determine relevant and irrelevant images from label information, then construct a relevant graph and an irrelevant graph restricted to visually similar images. A linear feature transformation matrix is derived from the two graphs. The transformation maps images to a low-dimensional subspace in which neighboring relevant images are pulled closer together while irrelevant images are pushed apart. The reduced features are therefore well suited to KNN-based image annotation. Experiments on the Corel dataset demonstrate the effectiveness of our dimensionality reduction method for KNN-based image annotation.
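The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It follows the standard single-label LSDA formulation and makes several assumptions the abstract does not specify: two images count as "relevant" when they share at least one label, graph edges are binary and limited to the k nearest visual neighbors, and the projection is taken from the top eigenvectors of a regularized generalized eigenproblem. The function names `lsda_multilabel` and `knn_annotate` and the parameters `alpha` and `reg` are hypothetical.

```python
import numpy as np

def lsda_multilabel(X, Y, k=5, alpha=0.5, dim=2, reg=1e-6):
    """X: (n, d) features, Y: (n, L) binary labels. Returns a (d, dim) projection."""
    n, d = X.shape
    # Assumption: images are "relevant" to each other if they share a label.
    relevant = (Y @ Y.T) > 0
    # Symmetric k-nearest-neighbor mask in the visual feature space.
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # exclude self-matches
    nn = np.argsort(dist, axis=1)[:, :k]
    knn = np.zeros((n, n), dtype=bool)
    knn[np.arange(n)[:, None], nn] = True
    knn |= knn.T
    # Relevant and irrelevant graphs over visually similar images.
    W_rel = (knn & relevant).astype(float)
    W_irr = (knn & ~relevant).astype(float)
    D_rel = np.diag(W_rel.sum(axis=1))
    L_irr = np.diag(W_irr.sum(axis=1)) - W_irr
    # LSDA-style objective: solve A v = lambda * B v for the largest lambda.
    A = X.T @ (alpha * L_irr + (1.0 - alpha) * W_rel) @ X
    B = X.T @ D_rel @ X + reg * np.eye(d)  # small ridge keeps B positive definite
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)
    top = np.argsort(vals)[::-1][:dim]
    return C_inv.T @ vecs[:, top]

def knn_annotate(P, X_train, Y_train, x_query, k=5, n_tags=2):
    """Vote over the labels of the k nearest training images in the subspace."""
    Z = X_train @ P
    z = x_query @ P
    nearest = np.argsort(np.sum((Z - z) ** 2, axis=1))[:k]
    votes = Y_train[nearest].sum(axis=0)
    return np.argsort(votes)[::-1][:n_tags]
```

Because neighbor search runs in the `dim`-dimensional subspace rather than the original feature space, the per-query distance cost drops roughly in proportion to `dim / d`, which is how a projection of this kind would address the computational-cost problem.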

[1] Jieping Ye, et al. Canonical Correlation Analysis for Multilabel Classification: A Least-Squares Formulation, Extensions, and Analysis, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Cordelia Schmid, et al. Coloring Local Feature Extraction, 2006, ECCV.

[3] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[4] Chris H. Q. Ding, et al. Multi-label Linear Discriminant Analysis, 2010, ECCV.

[5] Kun Zhou, et al. Locality Sensitive Discriminant Analysis, 2007, IJCAI.

[6] Vladimir Pavlovic, et al. Baselines for Image Annotation, 2010, International Journal of Computer Vision.

[7] Wei-Ying Ma, et al. AnnoSearch: Image Auto-Annotation by Search, 2006, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8] R. Manmatha, et al. Multiple Bernoulli relevance models for image and video annotation, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004).

[9] James Ze Wang, et al. Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach, 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.