An integer programming approach for the view and index selection problem

The view- and index-selection problem is a combinatorial optimization problem that arises in the context of on-line analytical processing (OLAP) in database-management systems. We propose an integer programming (IP) model for this problem and study the properties of the views and indexes that appear in the optimal solution for this model. We then use these properties to remove a number of variables and constraints from the corresponding IP model and obtain a model that is significantly smaller, yet its optimal solution is guaranteed to be optimal for the original problem. This allows us to solve realistic-size instances of the problem in reasonable time using commercial IP solvers. Subsequently, we propose heuristic strategies to further reduce the size of this IP model and dramatically reduce its execution time, although we no longer guarantee that the reduced IP model offers a globally optimal solution for the original problem. Finally, we carry out an extensive computational study to evaluate the effectiveness of these IP models for solving the OLAP view- and index-selection problem.

[1]  Kamalakar Karlapalem,et al.  On efficient storage space distribution among materialized views and indices in data warehousing environments , 2000, CIKM '00.

[2]  Dimitri Theodoratos,et al.  Processing OLAP queries in hierarchically clustered databases , 2003, Data Knowl. Eng..

[3]  Surajit Chaudhuri,et al.  An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server , 1997, VLDB.

[4]  Alberto Abelló,et al.  Data warehousing and OLAP (DOLAP'08) , 2010, Data Knowl. Eng..

[5]  Jennifer Widom,et al.  Database Systems: The Complete Book , 2001 .

[6]  Howard J. Karloff,et al.  On the complexity of the view-selection problem , 1999, PODS '99.

[7]  Jozef Kratica,et al.  A Genetic Algorithm for the Index Selection Problem , 2003, EvoWorkshops.

[8]  Panos Kalnis,et al.  View selection using randomized search , 2002, Data Knowl. Eng..

[9]  Rik Eshuis,et al.  An integer programming based approach for verification and diagnosis of workflows , 2010, Data Knowl. Eng..

[10]  Hamid Pirahesh,et al.  Recommending materialized views and indexes with the IBM DB2 design advisor , 2004, International Conference on Autonomic Computing, 2004. Proceedings..

[11]  Jeffrey D. Ullman,et al.  Index selection for OLAP , 1997, Proceedings 13th International Conference on Data Engineering.

[12]  Surajit Chaudhuri,et al.  An overview of data warehousing and OLAP technology , 1997, SGMD.

[13]  Stanley B. Zdonik,et al.  CORADD: Correlation Aware Database Designer for Materialized Views and Indexes , 2010, Proc. VLDB Endow..

[14]  Clifford M. Broughton IBM DB2 Cube Views and DB2 Materialized Query Tables in a SAS ® Environment , 2005 .

[15]  Surajit Chaudhuri,et al.  An overview of business intelligence technology , 2011, Commun. ACM.

[16]  Matteo Golfarelli,et al.  Materialization of fragmented views in multidimensional databases , 2004, Data Knowl. Eng..

[17]  Rada Chirkova,et al.  A Formal Model for the Problem of View Selection for Aggregate Queries , 2005, ADBIS.

[18]  Sam Lightstone,et al.  Physical Database Design for Relational Databases , 2009, Encyclopedia of Database Systems.

[19]  Rong Huang,et al.  Deterministic View Selection for Data-Analysis Queries: Properties and Algorithms , 2012, ADBIS.

[20]  Hamid Pirahesh,et al.  A Framework for Using Materialized XPath Views in XML Query Processing , 2004, VLDB.

[21]  Ralph Kimball,et al.  The Data Warehouse Lifecycle Toolkit , 2009 .

[22]  Surajit Chaudhuri,et al.  Interactive physical design tuning , 2010, 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010).

[23]  Matteo Fischetti,et al.  Exact and Approximate Algorithms for the Index Selection Problem in Physical Database Design , 1995, IEEE Trans. Knowl. Data Eng..

[24]  Christie I. Ezeife A uniform approach for selecting views and indexes in a data warehouse , 1997, Proceedings of the 1997 International Database Engineering and Applications Symposium (Cat. No.97TB100166).

[25]  Jian Yang,et al.  Algorithms for Materialized View Design in Data Warehousing Environment , 1997, VLDB.

[26]  Rada Chirkova,et al.  Exact and inexact methods for solving the view and index selection problem for olap performance improvement , 2010 .

[27]  Jeffrey D. Ullman,et al.  Implementing data cubes efficiently , 1996, SIGMOD '96.

[28]  G. Nemhauser,et al.  Integer Programming , 2020 .

[29]  Surajit Chaudhuri,et al.  Automated Selection of Materialized Views and Indexes in SQL Databases , 2000, VLDB.

[30]  Daniel C. Zilio,et al.  DB2 advisor: an optimizer smart enough to recommend its own indexes , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[31]  B. Barnes,et al.  Review: The Data Warehouse Toolkit (Second Edition) , 2003 .

[32]  Gerhard Weikum,et al.  Database Tuning using Combinatorial Search , 2009, Encyclopedia of Database Systems.

[33]  Elena Baralis,et al.  Materialized Views Selection in a Multidimensional Database , 1997, VLDB.

[34]  Rada Chirkova,et al.  Exact and inexact methods for selecting views and indexes for OLAP performance improvement , 2008, EDBT '08.

[35]  Rada Chirkova,et al.  Exact and inexact methods for solving the problem of view selection for aggregate queries , 2009, Int. J. Bus. Intell. Data Min..

[36]  Carleton Coffrin,et al.  Optimizing index deployment order for evolving OLAP , 2012, EDBT '12.

[37]  Laurence A. Wolsey,et al.  Integer and Combinatorial Optimization , 1988 .

[38]  Surajit Chaudhuri,et al.  3 Self-Tuning Histograms : Exploiting Execution Feedback , 2006 .

[39]  Jeffrey F. Naughton,et al.  Materialized View Selection for Multidimensional Datasets , 1998, VLDB.

[40]  Surajit Chaudhuri,et al.  Index selection for databases: a hardness study and a principled heuristic solution , 2004, IEEE Transactions on Knowledge and Data Engineering.

[41]  Hamid Pirahesh,et al.  Recommending materialized views and indexes with the IBM DB2 design advisor , 2004 .