Locality condensation: a new dimensionality reduction method for image retrieval

Content-based image similarity search plays a key role in multimedia retrieval. Each image is usually represented as a point in a high-dimensional feature space. The key challenge of searching similar images from a large database is the high computational overhead due to the "curse of dimensionality". Reducing the dimensionality is an important means to tackle the problem. In this paper, we study dimensionality reduction for top-k image retrieval. Intuitively, an effective dimensionality reduction method should not only preserve the close locations of similar images (or points), but also separate those dissimilar ones far apart in the reduced subspace. Existing dimensionality reduction methods mainly focused on the former. We propose a novel idea called Locality Condensation (LC) to not only preserve localities determined by neighborhood information and their global similarity relationship, but also ensure that different localities will not invade each other in the low-dimensional subspace. To generate non-overlapping localities in the subspace, LC first performs an elliptical condensation, which condenses each locality with an elliptical shape into a more compact hypersphere to enlarge the margins among different localities and estimate the projection in the subspace for overlap analysis. Through a convex optimization, LC further performs a scaling condensation on the obtained hyperspheres based on their projections in the subspace with minimal condensation degrees. By condensing the localities effectively, the potential overlaps among different localities in the low-dimensional subspace are prevented. Consequently, for similarity search in the subspace, the number of false hits (i.e., distant points that are falsely retrieved) will be reduced. Extensive experimental comparisons with existing methods demonstrate the superiority of our proposal.

[1]  Deng Cai,et al.  Orthogonal locality preserving indexing , 2005, SIGIR '05.

[2]  Hans-Jörg Schek,et al.  A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces , 1998, VLDB.

[3]  Christian Böhm,et al.  Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases , 2001, CSUR.

[4]  Charles Elkan,et al.  Using the Triangle Inequality to Accelerate k-Means , 2003, ICML.

[5]  Beng Chin Ooi,et al.  iDistance: An adaptive B+-tree based indexing method for nearest neighbor search , 2005, TODS.

[6]  Forrest W. Young Multidimensional Scaling: History, Theory, and Applications , 1987 .

[7]  Michael E. Tipping,et al.  Probabilistic Principal Component Analysis , 1999 .

[8]  Joshua B. Tenenbaum,et al.  Global Versus Local Methods in Nonlinear Dimensionality Reduction , 2002, NIPS.

[9]  Mikhail Belkin,et al.  Laplacian Eigenmaps for Dimensionality Reduction and Data Representation , 2003, Neural Computation.

[10]  Jiawei Han,et al.  Spectral regression: a unified subspace learning framework for content-based image retrieval , 2007, ACM Multimedia.

[11]  Kun Zhou,et al.  Laplacian optimal design for image retrieval , 2007, SIGIR.

[12]  Divyakant Agrawal,et al.  A comparison of DFT and DWT based similarity search in time-series databases , 2000, CIKM '00.

[13]  Daniel A. Keim,et al.  Similarity search in multimedia databases , 2004, Proceedings. 20th International Conference on Data Engineering.

[14]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[15]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[16]  Heng Tao Shen,et al.  Principal Component Analysis , 2009, Encyclopedia of Biometrics.

[17]  Sharad Mehrotra,et al.  Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces , 2000, VLDB.

[18]  Oliver Günther,et al.  Multidimensional access methods , 1998, CSUR.

[19]  Jiawei Han,et al.  Learning a Maximum Margin Subspace for Image Retrieval , 2008, IEEE Transactions on Knowledge and Data Engineering.

[20]  Nicolas Le Roux,et al.  Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering , 2003, NIPS.

[21]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[22]  Jiri Matas,et al.  Improving Descriptors for Fast Tree Matching by Optimal Linear Projection , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[23]  Wei-Ying Ma,et al.  Locality preserving indexing for document representation , 2004, SIGIR '04.

[24]  Richard A. Harshman,et al.  Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..

[25]  Aoying Zhou,et al.  An adaptive and dynamic dimensionality reduction method for high-dimensional indexing , 2007, The VLDB Journal.

[26]  Qi Tian,et al.  Learning image manifolds by semantic subspace projection , 2006, MM '06.

[27]  Xiaofei He,et al.  Locality Preserving Projections , 2003, NIPS.

[28]  Stefan M. Rüger,et al.  Automated Image Annotation Using Global Features and Robust Nonparametric Density Estimation , 2005, CIVR.

[29]  Wei-Ying Ma,et al.  Locality preserving clustering for image database , 2004, MULTIMEDIA '04.