Scalable parallelization of skyline computation for multi-core processors

The skyline is an important query operator for multi-criteria decision making. It reduces a dataset to only those points that offer optimal trade-offs among dimensions, that is, the points not dominated by any other point. In general, it is very expensive to compute. Multicore CPU algorithms have recently been proposed to accelerate skyline computation. However, they do not sufficiently minimize dominance tests and so are not competitive with state-of-the-art sequential algorithms. In this paper, we introduce a novel multicore skyline algorithm, Hybrid, which processes points in blocks. It maintains a global skyline shared among all threads, which is used to minimize dominance tests while maintaining high throughput. The algorithm uses an efficiently updatable data structure over the shared global skyline, based on point-based partitioning. We also release a large benchmark of optimized skyline algorithms, with which we demonstrate, on challenging workloads, a 100-fold speedup over state-of-the-art multicore algorithms and a 10-fold speedup with 16 cores over state-of-the-art sequential algorithms.
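
To make the terms concrete, the following is a minimal sequential sketch of a dominance test and of block-wise processing against a single global skyline. It is not the Hybrid algorithm itself: it uses no threads and no point-based partitioning structure, and it assumes that smaller values are better on every dimension. The names `dominates`, `skyline_in_blocks`, and `block_size` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p is no worse than q on every dimension and
    strictly better on at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline_in_blocks(points: List[Point], block_size: int = 2) -> List[Point]:
    """Illustrative, sequential block-wise skyline (not the paper's method)."""
    skyline: List[Point] = []  # global skyline of all points processed so far
    for start in range(0, len(points), block_size):
        block = points[start:start + block_size]
        # Phase 1: drop block points dominated by the current global skyline.
        survivors = [p for p in block
                     if not any(dominates(s, p) for s in skyline)]
        # Phase 2: drop survivors dominated by another point of the same block.
        survivors = [p for p in survivors
                     if not any(dominates(q, p) for q in survivors)]
        # Phase 3: evict skyline points dominated by the new survivors, then merge.
        skyline = [s for s in skyline
                   if not any(dominates(p, s) for p in survivors)]
        skyline.extend(survivors)
    return skyline


if __name__ == "__main__":
    data = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (1, 6)]
    print(sorted(skyline_in_blocks(data)))
```

In this sketch every dominance test is a full scan of the global skyline. An index over the skyline, such as the one described in the abstract, would let most of those tests be skipped.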
