How to Reduce the View Selection Problem through CoDe Modeling

Visualizing Big Data is difficult because of the sheer amount of information contained in data warehouses. Accurately representing the relationships among data therefore becomes crucial for business knowledge discovery. CoDe is a tool for modeling and visualizing such relationships: it processes several queries on a data mart and generates a visualization of the resulting data. On a large data warehouse, however, the response time of these queries grows with their complexity. A common way to speed up data warehousing is to precompute a set of materialized views, store them in the warehouse, and use them to answer the workload queries. This paper presents a new process that exploits CoDe modeling to mitigate the view selection problem, i.e., the problem of selecting the optimal set of views to materialize. In particular, the proposed process determines the minimal number of required OLAP queries, creates an ad hoc lattice structure to represent them, and selects the views to materialize on that structure using a heuristic based on processing time cost and view storage space. An experiment on a real data warehouse shows an improvement in the range of 36-98% with respect to an approach that does not use materialized views, and of 7% with respect to an approach that exploits them. We also show how the results are affected by the lattice structure.
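
To make the idea of selecting views on a lattice concrete, the following is a minimal sketch of a generic greedy selection that trades query processing cost against storage space. It is not the heuristic proposed in the paper, whose details are not given here. It assumes a linear cost model (answering a query costs as many rows as the smallest materialized view that covers it), and the dimension names, row counts, workload, and space budget are all hypothetical.

```python
def query_cost(query, materialized, size):
    """Cost of a query: rows of the smallest materialized view that covers it.

    A view covers a query when the query's group-by attributes are a subset
    of the view's attributes (the ordering relation of the lattice).
    """
    return min(size[v] for v in materialized if query <= v)


def total_cost(queries, materialized, size):
    """Sum of the costs of all workload queries."""
    return sum(query_cost(q, materialized, size) for q in queries)


def greedy_select(queries, size, top, space_budget):
    """Greedily add the view with the best cost reduction per stored row.

    The top view (the base fact data) is always available and is not
    charged against the budget. Selection stops when no remaining view
    both fits in the budget and reduces the workload cost.
    """
    materialized = {top}
    used = 0
    while True:
        base = total_cost(queries, materialized, size)
        best, best_score = None, 0.0
        for view in size:
            if view in materialized or used + size[view] > space_budget:
                continue
            benefit = base - total_cost(queries, materialized | {view}, size)
            score = benefit / size[view]
            if score > best_score:
                best, best_score = view, score
        if best is None:
            return materialized
        materialized.add(best)
        used += size[best]


# Hypothetical lattice over three dimensions, with assumed row counts.
P, S, M = "product", "store", "month"
top = frozenset({P, S, M})
size = {
    top: 1000,
    frozenset({P, S}): 600,
    frozenset({P, M}): 500,
    frozenset({S, M}): 300,
    frozenset({P}): 100,
    frozenset({S}): 20,
    frozenset({M}): 12,
    frozenset(): 1,
}

# Hypothetical workload: the group-by sets of the required OLAP queries.
queries = [frozenset({P}), frozenset({S}), frozenset({S, M}), frozenset({M})]

selected = greedy_select(queries, size, top, space_budget=400)
cost_before = total_cost(queries, {top}, size)
cost_after = total_cost(queries, selected, size)
print(sorted(sorted(v) for v in selected), cost_before, cost_after)
```

With these assumed numbers the sketch materializes the three single-dimension views and leaves out the store-month view, which would exceed the space budget. Because the score divides the benefit by the view size, small views that serve at least one query are favored; a different weighting of time against space would change the selection.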
