Similarity search in multimedia databases

Research on multimedia databases spans several areas of computer science, such as computer graphics, databases, and information retrieval. Many practical applications benefit from this research, e.g., molecular biology, medicine, CAD/CAM, and geography. An important characteristic of these applications is the variety of data that must be supported, e.g., text, images (both still and moving), and audio. As a result, developing a multimedia information system is considerably more complex than developing a traditional information system. An important research issue in the field of multimedia databases is the content-based retrieval of similar objects. Given a multimedia query object, searching the database for an exact match is not meaningful in most applications, because the probability that two multimedia objects are identical is negligible (unless they are digital copies from the same source). For this reason, the development of efficient and effective similarity search techniques has become an important topic in the multimedia database research community. The goal of this advanced technology seminar is to provide an overview of the similarity search problem and to present state-of-the-art techniques for performing efficient and effective similarity queries in multimedia databases. The seminar begins with an introduction to, and motivation for, multimedia databases. The two main approaches for describing multimedia objects (as elements in a metric space or in a vector space) are introduced, together with a description of the "Multimedia Content Description Interface" (MPEG-7) standard. The efficiency issue is addressed for both the metric and the vector space approaches by describing the data structures and algorithms used to answer similarity queries. For the effectiveness issue, the seminar introduces some widely used retrieval performance measures.
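As a rough illustration of the two issues named above, the following minimal sketch answers the two usual kinds of similarity query (range and k-nearest-neighbor) by a linear scan, and scores a result list. It is not taken from the seminar: the function names, the choice of the Euclidean distance, and the use of precision and recall as the performance measures are all assumptions made for this example. The index structures the seminar covers exist precisely to avoid the full scan shown here.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance; assumed here as an example of a metric on feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(db: List[Vector], query: Vector, radius: float,
                dist: Distance = euclidean) -> List[int]:
    """Return indices of all objects within `radius` of `query` (linear scan)."""
    return [i for i, obj in enumerate(db) if dist(query, obj) <= radius]


def knn_query(db: List[Vector], query: Vector, k: int,
              dist: Distance = euclidean) -> List[int]:
    """Return indices of the k objects closest to `query`, nearest first (linear scan)."""
    return heapq.nsmallest(k, range(len(db)), key=lambda i: dist(query, db[i]))


def precision_recall(retrieved: List[int], relevant: List[int]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant (0.0 if the denominator is empty)."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors standing in for multimedia objects.
    db = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
    query = (0.0, 0.0)
    print(range_query(db, query, radius=1.5))
    print(knn_query(db, query, k=3))
    print(precision_recall(knn_query(db, query, k=3), relevant=[0, 2, 3]))
```

Because both query functions take the distance as a parameter, the same sketch applies to objects that are only known to live in a metric space: any function satisfying the metric axioms can be passed as `dist`.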
Several examples of techniques for particular multimedia applications (text, image, CAD, 3D objects, audio and video) are presented. The seminar outline is as follows:
