Parallel Skyline Computation on Multicore Architectures

With the advent of multicore processors, it has become imperative to write parallel programs in order to exploit the next generation of processors. This paper deals with skyline computation as a case study of parallelizing database operations on multicore architectures. We compare two parallel skyline algorithms: a parallel version of the branch-and-bound algorithm (BBS) and a new parallel algorithm based on skeletal parallel programming. Experimental results show that, despite its simple design, the new parallel algorithm is comparable to parallel BBS in speed. For sequential skyline computation, the new algorithm far outperforms sequential BBS when the density of skyline tuples is low.
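To make the terms concrete, here is a minimal sketch of skyline computation arranged as a map step followed by a reduce step, which is the general shape that skeletal parallel programming suggests. It is an illustration only, not the algorithm evaluated in the paper: the function names (`dominates`, `local_skyline`, `merge_skylines`, `parallel_skyline`), the block partitioning, and the convention that smaller values are better in every dimension are all assumptions made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumption for this sketch: smaller values are better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def local_skyline(points):
    """Sequentially compute the skyline (non-dominated tuples) of one block."""
    sky = []
    for p in points:
        if any(dominates(s, p) for s in sky):
            continue  # p is dominated, so it cannot be a skyline tuple
        # p survives: drop any current candidates that p dominates
        sky = [s for s in sky if not dominates(p, s)]
        sky.append(p)
    return sky


def merge_skylines(left, right):
    """Combine two partial skylines into the skyline of their union."""
    return local_skyline(left + right)


def parallel_skyline(points, workers=4):
    """Map: skyline of each block in parallel. Reduce: merge partial skylines."""
    size = max(1, -(-len(points) // workers))  # ceiling division
    blocks = [points[i:i + size] for i in range(0, len(points), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(local_skyline, blocks))
    return reduce(merge_skylines, partial, [])


if __name__ == "__main__":
    data = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (2, 6)]
    print(sorted(parallel_skyline(data, workers=3)))
```

In standard CPython the thread pool mainly shows the structure of the computation rather than delivering a multicore speedup; a process pool or a language with native threads would be needed for that. The reduce step here is sequential, which is one of several possible design choices.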
