Speeding Up Materialized View Selection in Data Warehouses Using a Randomized Algorithm

A data warehouse stores information that is collected from multiple, heterogeneous information sources for the purpose of complex querying and analysis. Information in the warehouse is typically stored in the form of materialized views, which represent pre-computed portions of frequently asked queries. One of the most important tasks when designing a warehouse is selecting the materialized views to be maintained in it. The goal is to select a set of views that minimizes the total query response time over all queries, given a limited amount of time for maintaining the views (the maintenance-cost view selection problem). In this paper, we propose an efficient solution to the maintenance-cost view selection problem that uses a genetic algorithm to compute a near-optimal set of views. Specifically, we explore the maintenance-cost view selection problem in the context of OR view graphs. We show that our approach represents a dramatic improvement in time complexity over existing search-ba...
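
The problem statement above can be made concrete with a small sketch. The Python code below is an illustration only, not the authors' implementation: the toy instance, the cost numbers, the penalty scheme, and every name in it (`MAINTENANCE`, `QUERIES`, `genetic_view_selection`, and so on) are assumptions introduced here. It models an OR view graph in the simplest possible way (each query is answered from the cheapest single materialized view that can answer it, or from the base tables otherwise) and runs a plain genetic algorithm over bit strings, where bit i says whether view i is materialized.

```python
import random

# Hypothetical toy instance; all numbers are made up for illustration.
MAINTENANCE = [4, 3, 5, 2, 6]   # maintenance time of each candidate view
BUDGET = 9                      # total maintenance time available
# Each query: (frequency, cost from base tables, {view index: cost via that view}).
# A query needs only ONE of its listed views, as in an OR view graph.
QUERIES = [
    (10, 100, {0: 20, 2: 35}),
    (5, 80, {1: 10}),
    (8, 120, {2: 15, 4: 5}),
    (3, 60, {3: 12, 0: 30}),
]
N_VIEWS = len(MAINTENANCE)


def query_cost(selection):
    """Frequency-weighted response time, using the cheapest available source per query."""
    total = 0
    for freq, base_cost, via_view in QUERIES:
        options = [base_cost] + [c for v, c in via_view.items() if selection[v]]
        total += freq * min(options)
    return total


def maintenance_cost(selection):
    return sum(m for m, bit in zip(MAINTENANCE, selection) if bit)


def is_feasible(selection):
    return maintenance_cost(selection) <= BUDGET


# Penalty per unit of budget overshoot. Using the cost of the empty selection
# guarantees that every infeasible selection scores worse than every feasible one.
PENALTY = query_cost([0] * N_VIEWS)


def fitness(selection):
    """Lower is better: query cost plus a penalty for exceeding the budget."""
    overshoot = max(0, maintenance_cost(selection) - BUDGET)
    return query_cost(selection) + PENALTY * overshoot


def genetic_view_selection(generations=60, pop_size=20, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Seed the population with the empty (always feasible) selection.
    population = [[0] * N_VIEWS] + [
        [rng.randint(0, 1) for _ in range(N_VIEWS)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness)
        next_gen = [population[0][:]]          # elitism: keep the best individual
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            cut = rng.randint(1, N_VIEWS - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            next_gen.append(child)
        population = next_gen
    return min(population, key=fitness)


if __name__ == "__main__":
    best = genetic_view_selection()
    print("selected views:", [i for i, bit in enumerate(best) if bit])
    print("query cost:", query_cost(best), "maintenance:", maintenance_cost(best))
```

Because the population starts with the empty selection and the best individual is carried over each generation, the result of this sketch always respects the maintenance budget. Handling the constraint through a penalty term is one possible design choice; repairing infeasible children or rejecting them outright would be alternatives.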
