Parallel Skyline Computation on Multicore Architectures

With the advent of multicore processors, it has become imperative to write parallel programs if one wishes to exploit the next generation of processors. This paper deals with skyline computation as a case study of parallelizing database operations on multicore architectures. We compare two parallel skyline algorithms: a parallel version of the branch-and-bound algorithm (BBS) and a new parallel algorithm based on skeletal parallel programming. Experimental results show that, despite its simple design, the new parallel algorithm is comparable to parallel BBS in speed. For sequential skyline computation, the new algorithm far outperforms sequential BBS when the density of skyline tuples is low.
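
To make the terms concrete, the following is a minimal sketch of a skyline computation arranged in a map-and-reduce shape, which is the general flavor of skeletal parallel programming. It is not the algorithm proposed in the paper and it is not BBS: the function names (`dominates`, `skyline`, `merge`, `parallel_skyline`), the `chunks` parameter, the sample data, and the convention that smaller values are better are all assumptions made for illustration. The local skylines use a naive quadratic scan, and CPython threads are used only to show the structure, since the global interpreter lock limits their actual speedup on CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def dominates(p, q):
    """Return True if p dominates q: p is no worse in every dimension
    and strictly better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Naive quadratic skyline: keep each tuple that no other tuple dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def merge(left, right):
    """Combine two local skylines into the skyline of their union."""
    return skyline(left + right)


def parallel_skyline(points, chunks=4):
    """Map: compute a local skyline per block. Reduce: merge the local results."""
    size = max(1, -(-len(points) // chunks))  # ceiling division
    blocks = [points[i:i + size] for i in range(0, len(points), size)]
    with ThreadPoolExecutor() as pool:
        local = list(pool.map(skyline, blocks))
    return reduce(merge, local, [])


# Hypothetical data: (price, distance to beach) for six hotels.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 2), (55, 9)]
result = sorted(parallel_skyline(hotels, chunks=3))
print(result)  # [(50, 8), (60, 3), (80, 1)]
```

The merge step is correct because any tuple dominated within its own block is also dominated in the full dataset, so discarding it early never removes a true skyline tuple. When the density of skyline tuples is low, the local skylines are small and the merge step has little work left to do.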
