SOM-based R*-tree for similarity retrieval

Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects (e.g., documents, images, video, and music scores). For example, images are represented by their color histograms, texture vectors, and shape descriptors. A feature vector represents a set of features and is usually high-dimensional. The performance of conventional multidimensional data structures (e.g., the R-tree family, K-D-B tree, grid file, and TV-tree) tends to deteriorate as the number of dimensions of the feature vectors increases. The R*-tree is the most successful variant of the R-tree. We propose a SOM-based R*-tree as a new indexing method for high-dimensional feature vectors. The SOM-based R*-tree combines a SOM and an R*-tree to achieve search performance that scales better to high dimensionalities. Self-organizing maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. The mapping preserves the topology of the feature vectors. The resulting map, called a topological feature map, preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes. We experimentally compare the retrieval time cost of a SOM-based R*-tree with that of a SOM and an R*-tree using color feature vectors extracted from 40,000 images.
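The SOM mapping step described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (train_som, best_matching_unit), the grid size, the linearly decaying learning rate, and the Gaussian neighborhood are all assumptions chosen for illustration. The abstract does not say how the trained map is loaded into the R*-tree, so the sketch stops at producing the map and looking up the node for a feature vector.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(codebook, vector):
    """Return the (row, col) of the node whose codebook vector is closest."""
    return min(codebook, key=lambda pos: squared_distance(codebook[pos], vector))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns {(row, col): codebook vector}.

    Hypothetical helper: the decay schedules below are assumptions,
    not parameters taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    codebook = {(r, c): [rng.random() for _ in range(dim)]
                for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for vector in rng.sample(data, len(data)):  # shuffled pass over data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood shrinks
            br, bc = best_matching_unit(codebook, vector)
            for (r, c), weights in codebook.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull each node toward the input, weighted by grid proximity
                # to the winning node; this is what preserves topology.
                for i in range(dim):
                    weights[i] += lr * h * (vector[i] - weights[i])
            step += 1
    return codebook


# Toy 8-dimensional "feature vectors" drawn around two centers (made-up data).
_rng = random.Random(1)
toy_data = ([[0.1 + 0.05 * _rng.random() for _ in range(8)] for _ in range(20)]
            + [[0.9 - 0.05 * _rng.random() for _ in range(8)] for _ in range(20)])
toy_map = train_som(toy_data, rows=4, cols=4)
print(best_matching_unit(toy_map, toy_data[0]),
      best_matching_unit(toy_map, toy_data[-1]))
```

Each feature vector is reduced to the two-dimensional grid position of its best-matching node, so mutually similar vectors tend to land on the same or neighboring nodes. A low-dimensional index could then be built over the map rather than over the raw high-dimensional vectors; how the paper does this is not specified in the abstract.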
