Integrating image clustering and memory indexing for large scale content-based image retrieval

Content-Based Image Retrieval (CBIR) is an important research topic in information retrieval, drawing on computer graphics, image processing, data mining, and pattern recognition. To make content-based image retrieval suitable for large-scale image databases, we develop an effective dynamic hierarchical clustering index scheme. Although the system uses hierarchical clustering, finding the relevant cluster centers becomes slow as the number of centers increases, and this search becomes the system's performance bottleneck. In this paper, a memory index over the content features of images is built. This method effectively improves retrieval speed without loss of precision. Moreover, the clustering model is improved by integrating the content features and textual features of images, which greatly improves clustering accuracy and thus significantly improves the system's precision.
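To make the idea of a hierarchical clustering index concrete, the following is a minimal sketch, not the scheme developed in the paper. It assumes Euclidean distance over small feature vectors and a tree of cluster centers built in advance. The names `Node`, `dist`, and `search` are hypothetical and chosen only for illustration.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class Node:
    """A cluster: a center plus either child clusters or (image_id, feature) pairs.

    Hypothetical structure for illustration; the paper's index layout is not
    specified in the abstract.
    """
    def __init__(self, center, children=None, images=None):
        self.center = center
        self.children = children or []
        self.images = images or []

def search(root, query, k=1):
    """Descend to the nearest center at each level, then rank the leaf's images."""
    node = root
    while node.children:
        # Only the children of the current node are compared, not every center.
        node = min(node.children, key=lambda c: dist(c.center, query))
    ranked = sorted(node.images, key=lambda item: dist(item[1], query))
    return [image_id for image_id, _ in ranked[:k]]

# Toy two-level tree with made-up 2-D features.
leaf_a = Node([0.0, 0.0], images=[("img1", [0.1, 0.0]), ("img2", [0.0, 0.3])])
leaf_b = Node([5.0, 5.0], images=[("img3", [5.1, 4.9]), ("img4", [4.0, 5.0])])
root = Node([2.5, 2.5], children=[leaf_a, leaf_b])

result = search(root, [4.9, 5.0], k=2)
```

In this sketch the cost of locating a leaf depends on the number of centers compared at each level, which is the step the abstract identifies as the bottleneck when the number of centers grows.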
