Using extended feature objects for partial similarity retrieval

Abstract. In this paper, we introduce the concept of extended feature objects for similarity retrieval. Conventional approaches to similarity search in databases map each object to a point in a high-dimensional feature space and define similarity as a distance measure in that space. For many similarity search problems, this point-based approach is not sufficient. When retrieving partially similar polygons, for example, the search cannot be restricted to edge sequences, since similar polygon sections may start and end anywhere on the edges of the polygons. In general, inherently continuous problems such as partial similarity search cannot be solved using point objects in feature space. In our solution, we therefore introduce extended feature objects, each consisting of an infinite set of feature points. For efficient storage and retrieval of the extended feature objects, we determine their minimal bounding boxes in multidimensional space and store these boxes in a spatial access structure. In our concrete polygon problem, sets of polygon sections are mapped to 2D feature objects in high-dimensional space, which are then approximated by minimal bounding boxes and stored in an R*-tree. The selectivity of the index is improved by adaptively decomposing very large feature objects and dynamically joining small ones. For the polygon problem, invariance under translation, rotation, and scaling is achieved by using the Fourier-transformed curvature of the normalized polygon sections. In contrast to vertex-based algorithms, our algorithm guarantees that no false dismissals occur and additionally provides fast search times for realistic database sizes. We evaluate our method using real polygon data from a supplier to the car manufacturing industry.
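The bounding-box filter step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bounding_box`, `min_dist_to_box`, `candidates`), the Euclidean distance, the plain dictionary used in place of an R*-tree, and the finite point samples standing in for an infinite feature object are all assumptions made for illustration. The sketch shows why a box filter produces no false dismissals: the distance from a query point to a box never exceeds its distance to any point inside the box, so every true match survives the filter as long as the box encloses the whole feature object.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[List[float], List[float]]  # (lower corner, upper corner)


def bounding_box(points: Sequence[Sequence[float]]) -> Box:
    """Axis-parallel minimal bounding box of a finite sample of feature points.

    Assumption: a finite sample stands in for the infinite point set of an
    extended feature object; a real system must bound the full set.
    """
    dims = len(points[0])
    lo = [min(p[d] for p in points) for d in range(dims)]
    hi = [max(p[d] for p in points) for d in range(dims)]
    return lo, hi


def min_dist_to_box(query: Sequence[float], box: Box) -> float:
    """Smallest Euclidean distance from the query point to any point in the box."""
    lo, hi = box
    total = 0.0
    for q, l, h in zip(query, lo, hi):
        if q < l:
            total += (l - q) ** 2
        elif q > h:
            total += (q - h) ** 2
    return total ** 0.5


def candidates(query: Sequence[float], boxes: Dict[str, Box], eps: float) -> List[str]:
    """Filter step: keep every object whose box lies within eps of the query.

    Candidates still need an exact refinement step; the filter only
    guarantees that no true match is dropped.
    """
    return [name for name, box in boxes.items()
            if min_dist_to_box(query, box) <= eps]


# Hypothetical 3D feature samples for two polygon sections.
boxes = {
    "section_a": bounding_box([(0.0, 0.0, 1.0), (1.0, 0.5, 1.5), (0.5, 1.0, 2.0)]),
    "section_b": bounding_box([(5.0, 5.0, 5.0), (6.0, 5.5, 5.2)]),
}
print(candidates((1.2, 0.4, 1.4), boxes, eps=0.5))  # prints ['section_a']
```

In a full system the linear scan over `boxes` would be replaced by a range query on a spatial access structure such as the R*-tree named in the abstract; the filter condition itself stays the same.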
