Materialized Views Selection for Answering Queries

A data warehouse stores historical data to support analytical query processing. Analytical queries are long and complex, and processing them against a large data warehouse is time-consuming, so query response time is high. One way to reduce this time is to select views that are likely to answer a large number of future queries and store them in the data warehouse. This problem is referred to as view selection. Several view selection algorithms have been proposed, most of them centered on HRUA. HRUA considers the size of the views to select the most beneficial view for materialization. The views selected using HRUA, though beneficial with respect to size, may be unable to answer a large number of queries, making them an unnecessary overhead. The algorithm proposed in this paper attempts to address this problem by considering the query frequency of a view, along with its size, to select the Top-K views for materialization. In each iteration, the proposed algorithm computes the profit of each view, defined in terms of size and query frequency, and selects the most profitable view for materialization. As a result, the views selected are beneficial with respect to size and are able to answer future queries. Further, experimental results show that the proposed algorithm, in comparison to HRUA, selects views capable of answering a larger number of queries at the expense of a slight increase in the total cost of evaluating all the views. This, in turn, would result in more efficient decision making.
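The iterative selection described above can be illustrated with a minimal sketch. The abstract does not give the profit formula, so the one used here (query frequency times the reduction in the size of the smallest materialized ancestor, summed over the views a candidate can answer) is an assumption made for illustration only, as are the function name `select_top_k_views`, the four-view lattice, and all sizes and frequencies.

```python
from typing import Dict, List, Set

# Hypothetical lattice: each view maps to the set of views it can answer
# (itself included). Sizes and query frequencies are made-up numbers.
SIZES: Dict[str, int] = {"ABC": 100, "AB": 50, "AC": 80, "A": 20}
FREQS: Dict[str, int] = {"ABC": 1, "AB": 2, "AC": 10, "A": 1}
DESCENDANTS: Dict[str, Set[str]] = {
    "ABC": {"ABC", "AB", "AC", "A"},
    "AB": {"AB", "A"},
    "AC": {"AC", "A"},
    "A": {"A"},
}


def select_top_k_views(
    sizes: Dict[str, int],
    freqs: Dict[str, int],
    descendants: Dict[str, Set[str]],
    root: str,
    k: int,
) -> List[str]:
    """Greedily pick up to k views, one per iteration, by assumed profit."""
    # cost[w]: size of the smallest materialized view able to answer w.
    # Only the root view is assumed to be materialized at the start.
    cost = {w: sizes[root] for w in sizes}
    candidates = set(sizes) - {root}
    selected: List[str] = []

    for _ in range(k):
        best_view, best_profit = None, 0
        for v in sorted(candidates):  # sorted for deterministic tie-breaking
            # Assumed profit: frequency-weighted size reduction over all
            # views that v can answer.
            profit = sum(
                freqs[w] * max(0, cost[w] - sizes[v]) for w in descendants[v]
            )
            if profit > best_profit:
                best_view, best_profit = v, profit
        if best_view is None:  # no remaining view yields a positive profit
            break
        selected.append(best_view)
        candidates.remove(best_view)
        # Views answerable from the new view may now be cheaper to evaluate.
        for w in descendants[best_view]:
            cost[w] = min(cost[w], sizes[best_view])
    return selected


print(select_top_k_views(SIZES, FREQS, DESCENDANTS, "ABC", 2))  # ['AC', 'AB']
```

With these made-up numbers, setting every frequency to 1 (a size-only criterion) makes the first pick `AB`, whereas the frequency-weighted profit picks `AC` first because it is queried far more often. That is the kind of difference the abstract attributes to taking query frequency into account.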
