Efficient analysis in multimedia databases

The rapid progress of digital technology has made computers ubiquitous tools that can now be found in almost every environment, industrial and private alike. With ever-increasing performance, computers have taken on more and more vital tasks in engineering, climate and environmental research, medicine, and the content industry. Previously, these tasks could only be accomplished by spending enormous amounts of time and money. Through the use of digital sensor devices, such as earth observation satellites, genome sequencers, or video cameras, the amount and complexity of data with a spatial or temporal relation has grown enormously. This poses new challenges for data analysis and requires the use of modern multimedia databases. This thesis aims at developing efficient techniques for the analysis of complex multimedia objects such as CAD data, time series, and videos. It is assumed that the data is modeled by commonly used representations. For example, CAD data is represented as a set of voxels, while audio and video data are represented as multi-represented, multi-dimensional time series. The main part of this thesis focuses on finding efficient methods for collision queries on complex spatial objects. One way to speed up such queries is to employ a cost-based decompositioning, which uses interval groups to approximate a spatial object. For example, this technique can be used in the Digital Mock-Up (DMU) process, which helps engineers to ensure short product cycles. The thesis also defines and discusses a new similarity measure for time series called threshold-similarity. Two time series are considered similar if they exhibit similar behavior with respect to exceeding a given threshold value.
Another part of the thesis is concerned with the efficient computation of reverse k-nearest neighbor (RkNN) queries in general metric spaces using conservative and progressive approximations. The aim of such RkNN queries is to determine the impact of single objects on the whole database. Finally, the thesis deals with video retrieval and hierarchical genre classification of music using multiple representations. The practical relevance of the discussed genre classification approach is highlighted by a prototype tool that helps the user to organize large music collections. Both the efficiency and the effectiveness of the presented techniques are thoroughly analyzed. The benefits over traditional approaches are shown by evaluating the new methods on real-world test datasets.
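
To make the notion of an RkNN query concrete, the following minimal sketch computes it by brute force: an object o belongs to the result if the query q is among the k nearest neighbors of o. This is only a naive quadratic baseline under the standard definition, not the approximation-based method developed in the thesis. The function name `rknn_brute_force`, the helper `abs_dist`, and the sample data are hypothetical and chosen for illustration; ties are resolved in favor of the query.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def rknn_brute_force(
    database: Sequence[T],
    query: T,
    k: int,
    dist: Callable[[T, T], float],
) -> List[T]:
    """Return all database objects that have `query` among their k nearest neighbors.

    Naive O(n^2) baseline (an illustrative assumption, not the thesis's algorithm).
    `dist` may be any metric; only distance values are used.
    """
    result = []
    for i, obj in enumerate(database):
        d_query = dist(obj, query)
        # Count the other database objects that are strictly closer to obj than the query.
        closer = sum(
            1
            for j, other in enumerate(database)
            if j != i and dist(obj, other) < d_query
        )
        # The query is one of obj's k nearest neighbors if fewer than k objects are closer.
        if closer < k:
            result.append(obj)
    return result


def abs_dist(a: float, b: float) -> float:
    """Hypothetical example metric: absolute difference on the real line."""
    return abs(a - b)


if __name__ == "__main__":
    points = [1.0, 2.0, 4.0, 8.0]
    print(rknn_brute_force(points, 3.0, 1, abs_dist))  # objects influenced by 3.0 for k = 1
```

Because the sketch relies only on a distance function, it applies to any metric space, which is the setting the abstract refers to. Its quadratic cost is what index-based approximation techniques aim to avoid.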
