Implementing OLAP Query Fragment Aggregation and Recombination for the OLAP Enabled Grid

In this paper we propose a new query processing method for the OLAP-enabled grid that combines sophisticated cache extraction techniques with data grid scheduling to satisfy OLAP queries efficiently in a distributed fashion. The heart of our approach is our query fragment aggregation and recombination (FAR) strategy, which partitions OLAP queries into subqueries that can be answered by retrieving and aggregating multiple fragments of cached data from nearby grid sources or, as a last resort, from more remote backend data warehouses. We have implemented and experimentally evaluated our query processing method and found that our strategy reduces query time by 50% to 60% for practical user cache sizes and network parameters.
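
The fragment idea described above can be illustrated with a small sketch. It is not the FAR algorithm itself, only a greedy stand-in under stated assumptions: a query is a list of chunk identifiers, each source holds precomputed partial SUM aggregates per chunk, and every source has a single hypothetical per-chunk transfer cost. The names `Source`, `plan_fragments`, and `answer_query` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A grid cache or backend warehouse (hypothetical model)."""
    name: str
    cost_per_chunk: float  # assumed transfer cost for one chunk
    chunks: dict           # chunk id -> partial SUM aggregate


def plan_fragments(query_chunks, caches, backend):
    """Assign each requested chunk to the cheapest cache holding it.

    The backend is used only when no cache holds the chunk.
    """
    plan = {}
    for chunk in query_chunks:
        holders = [s for s in caches if chunk in s.chunks]
        best = min(holders, key=lambda s: s.cost_per_chunk, default=backend)
        plan.setdefault(best.name, []).append(chunk)
    return plan


def answer_query(query_chunks, caches, backend):
    """Fetch each fragment from its planned source and recombine by summing."""
    sources = {s.name: s for s in [*caches, backend]}
    plan = plan_fragments(query_chunks, caches, backend)
    total = 0
    for name, chunks in plan.items():
        # Aggregate the fragment held by this source, then add it to the result.
        total += sum(sources[name].chunks[c] for c in chunks)
    return total, plan


# Example data: two nearby caches and a remote backend warehouse.
local = Source("local", 1.0, {"a": 10, "b": 20})
peer = Source("peer", 5.0, {"b": 20, "c": 30})
warehouse = Source("warehouse", 50.0, {"a": 10, "b": 20, "c": 30, "d": 40})

total, plan = answer_query(["a", "b", "c", "d"], [local, peer], warehouse)
```

In this example, chunks `a` and `b` come from the local cache, `c` from the peer, and only `d` falls through to the warehouse. A realistic planner would also have to weigh aggregation cost, network conditions, and partially overlapping fragments, which this sketch ignores.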
