The TV-tree: An index structure for high-dimensional data

We propose a file structure to index high-dimensionality data, which are typically points in some feature space. The idea is to use only a few of the features, using additional features only when the additional discriminatory power is absolutely necessary. We present in detail the design of our tree structure and the associated algorithms that handle such “varying length” feature vectors. Finally, we report simulation results, comparing the proposed structure with the R*-tree, which is one of the most successful methods for low-dimensionality spaces. The results illustrate the superiority of our method, which saves up to 80% in disk accesses.
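
The core idea of using extra features only when they are needed to tell items apart can be illustrated with a small sketch. This is not the TV-tree itself and not the authors' algorithm: the function name `active_prefix_length` and the sample points are hypothetical, and the sketch assumes features are already ordered by importance and that "discriminating" simply means the leading features of all vectors are pairwise distinct.

```python
from typing import Sequence


def active_prefix_length(vectors: Sequence[Sequence[float]]) -> int:
    """Return the smallest number of leading features that tells all vectors apart.

    Hypothetical helper for illustration only. Assumes all vectors have the
    same length and that features are ordered from most to least important.
    If some vectors are identical in every feature, the full length is returned.
    """
    if not vectors:
        return 0
    dims = len(vectors[0])
    for k in range(1, dims + 1):
        # Truncate every vector to its first k features.
        prefixes = {tuple(v[:k]) for v in vectors}
        if len(prefixes) == len(vectors):
            return k  # k features already discriminate every vector
    return dims


# Made-up sample data: the first and third points agree on two features.
points = [
    (0.1, 0.5, 0.9, 0.3),
    (0.7, 0.2, 0.4, 0.8),
    (0.1, 0.5, 0.2, 0.6),
]

print(active_prefix_length(points))      # 3
print(active_prefix_length(points[:2]))  # 1
```

In this toy setting, a group containing only the first two points needs a single feature, while adding the third point forces the use of three, which mirrors the "varying length" feature vectors described above.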
