Indexing Large Collections of Tumor-Like Shapes

We investigated the problem of retrieving similar shapes from a large medical database of tumor shapes (‘find tumors that are similar to a given pattern’). We used a natural similarity function for shape matching, based on state-of-the-art concepts from mathematical morphology, and showed how this function can be lower-bounded by a set of features extracted from the shapes. The lower bound guarantees “correct” output (i.e., no false dismissals), a key requirement for medical applications. The features can be organized in a spatial access method, giving fast indexing for range queries (‘find objects within distance ε of the given object’) and nearest neighbor queries (‘find the k objects closest to the query object’). Beyond the lower-bounding result, our second contribution is a fast algorithm for nearest neighbor search that achieves significant speedup while provably guaranteeing correctness. In our experiments, the proposed method performed up to 27 times better than sequential scanning. We also verified that the similarity function matches human perception of shape similarity: in experiments with human subjects, it obtained 80% precision for up to 100% recall.
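
The lower-bounding idea can be illustrated with a minimal sketch. It is not the morphological distance or the feature set from the paper. As a stand-in, it assumes shapes are already numeric vectors, takes the "true" distance to be Euclidean distance over the full vector, and uses the first m coordinates as features, so the feature distance can never exceed the true distance. A linear scan stands in for the spatial access method. All names (true_distance, features, feature_distance, range_query, knn_query) are hypothetical. The k-nearest-neighbor routine follows one common filter-and-refine scheme that stays correct whenever the lower bound holds; it is an assumption about the general approach, not a transcription of the authors' algorithm.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def true_distance(a: Vector, b: Vector) -> float:
    """Stand-in for the expensive shape distance (assumption: Euclidean)."""
    return math.dist(a, b)


def features(x: Vector, m: int = 2) -> Tuple[float, ...]:
    """Toy feature extraction: keep only the first m coordinates."""
    return tuple(x[:m])


def feature_distance(fa: Vector, fb: Vector) -> float:
    """Cheap distance in feature space; it lower-bounds true_distance
    because it sums a subset of the same squared differences."""
    return math.dist(fa, fb)


def range_query(db: List[Vector], query: Vector, eps: float) -> List[Vector]:
    """Return every object within true distance eps of the query."""
    fq = features(query)
    # Filter: discarding on the lower bound cannot cause a false dismissal.
    candidates = [x for x in db if feature_distance(features(x), fq) <= eps]
    # Refine: remove false alarms using the true distance.
    return [x for x in candidates if true_distance(x, query) <= eps]


def knn_query(db: List[Vector], query: Vector, k: int) -> List[Vector]:
    """Return the k objects closest to the query under the true distance."""
    if not db or k <= 0:
        return []
    fq = features(query)
    # Step 1: k nearest neighbors in feature space (cheap).
    first = heapq.nsmallest(
        k, db, key=lambda x: feature_distance(features(x), fq)
    )
    # Step 2: their largest true distance bounds the true k-th NN distance.
    eps = max(true_distance(x, query) for x in first)
    # Step 3: feature-space range query with radius eps; every true
    # k-nearest neighbor has feature distance <= true distance <= eps.
    candidates = [x for x in db if feature_distance(features(x), fq) <= eps]
    # Step 4: rank the candidates by true distance and keep the best k.
    return heapq.nsmallest(k, candidates, key=lambda x: true_distance(x, query))
```

In this sketch the expensive distance is computed only for objects that survive the feature-space filter, which is where a speedup over sequential scanning would come from. The answers match a sequential scan as long as the feature distance really is a lower bound on the true distance.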
