Efficient Locality Sensitive Clustering in Multimedia Retrieval

Clustering is fundamental to multimedia retrieval. For example, high-dimensional visual features are extracted and clustered for image content analysis in image retrieval, scene classification, and object retrieval applications. Existing clustering methods suffer from the curse of dimensionality when the data are high-dimensional, especially at large scale. Locality Sensitive Hashing (LSH) [1] was proposed to address this problem in high-dimensional indexing and is the most popular indexing scheme in multimedia. We propose an approximate clustering method, Locality Sensitive Clustering (LSC), for large-scale high-dimensional data. LSC uses pivots to estimate similarities between data points and generates clusters based on the Locality Sensitive Hashing scheme. Experiments on open datasets show that LSC achieves a significant improvement in clustering efficiency (by orders of magnitude) with little loss of accuracy compared to state-of-the-art methods.
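To make the hashing idea concrete, the following is a minimal sketch of how an LSH scheme based on p-stable (Gaussian) projections can group nearby points into shared buckets. It is not the LSC algorithm itself: the pivot-based similarity estimation is not specified in this abstract and is left out. The names `make_hash` and `lsh_buckets`, and the parameters `num_hashes`, `width`, and `seed`, are assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """Build one Gaussian-projection hash: h(v) = floor((a . v + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0.0, width)                    # random offset in [0, width]

    def h(v):
        projection = sum(x * y for x, y in zip(a, v))
        return math.floor((projection + b) / width)

    return h


def lsh_buckets(points, num_hashes=4, width=4.0, seed=0):
    """Group point indices by their concatenated hash key.

    Points that share a key fall into the same bucket and can be treated
    as a candidate cluster (an illustrative simplification, not LSC).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    hashes = [make_hash(dim, width, rng) for _ in range(num_hashes)]

    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        key = tuple(h(point) for h in hashes)
        buckets[key].append(idx)
    return dict(buckets)


if __name__ == "__main__":
    data = [[0.0] * 8, [0.0] * 8, [1.0] * 8]
    for key, members in lsh_buckets(data, seed=1).items():
        print(key, members)
```

Each point is hashed once, so bucketing costs time linear in the number of points, which is the kind of saving that motivates hashing-based approximate clustering. A larger `width` or fewer hash functions makes buckets coarser, while the opposite settings make them finer.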

[1] Antonio Torralba, et al. Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, 2001, International Journal of Computer Vision.

[2] Nicole Immorlica, et al. Locality-sensitive hashing scheme based on p-stable distributions, 2004, SCG '04.

[3] Loong Fah Cheong, et al. Randomized Locality Sensitive Vocabularies for Bag-of-Features Model, 2010, ECCV.

[4] David G. Lowe, et al. Object recognition from local scale-invariant features, 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[5] Piotr Indyk, et al. Approximate nearest neighbors: towards removing the curse of dimensionality, 1998, STOC '98.

[6] Sunil Arya, et al. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions, 1998, JACM.

[7] Andrew Zisserman, et al. Video Google: a text retrieval approach to object matching in videos, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[8] David Nistér, et al. Scalable Recognition with a Vocabulary Tree, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9] Driss Aboutajdine, et al. An efficient high-dimensional indexing method for content-based retrieval in large image databases, 2009, Signal Process. Image Commun.

[10] Cordelia Schmid, et al. Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search, 2008, ECCV.

[11] Antonin Guttman, et al. R-trees: a dynamic index structure for spatial searching, 1984, SIGMOD '84.

[12] Pietro Perona, et al. Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories, 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[13] Rongrong Ji, et al. Vocabulary hierarchy optimization for effective and transferable retrieval, 2009, CVPR.