Parallelization of group‐based skyline computation for multi‐core processors

Skyline computation is particularly useful in multi‐criteria decision‐making applications. However, it cannot answer queries that must analyze groups of points rather than only individual points. Computing the group‐based skyline is considerably more complicated and expensive than computing the traditional skyline. This computational challenge motivates us to use modern computing platforms to accelerate the computation. In this paper, we introduce a novel multi‐core algorithm to compute the group‐based skyline. We first compute the skyline layers of a data set in parallel; these layers are a critical intermediate result. During this step, we maintain an efficiently updatable data structure for the shared global skyline layers, which minimizes dominance tests and sustains high throughput. We then design an efficient parallel algorithm that finds the group‐based skyline from the skyline layers. Extensive experiments on real and synthetic data sets show that our algorithms achieve a 10‐fold speedup with 16 parallel threads over state‐of‐the‐art sequential algorithms on challenging workloads.
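To make the terms "dominance test" and "skyline layers" concrete, the following is a minimal sequential sketch, not the parallel algorithm or the shared data structure described above. It assumes that smaller values are preferred in every dimension and that layers are obtained by repeatedly peeling off the skyline of the remaining points; the names `dominates` and `skyline_layers` are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better in every dimension. p dominates q when
    p is no worse in all dimensions and strictly better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline_layers(points: Sequence[Point]) -> List[List[Point]]:
    """Partition points into skyline layers by naive repeated peeling.

    Layer 1 is the skyline of all points, layer 2 is the skyline of what
    remains after removing layer 1, and so on. This performs a quadratic
    number of dominance tests per layer and is meant only as illustration.
    """
    remaining = list(points)
    layers: List[List[Point]] = []
    while remaining:
        # A point belongs to the current layer if no remaining point dominates it.
        layer = [
            p for p in remaining
            if not any(dominates(q, p) for q in remaining)
        ]
        layers.append(layer)
        in_layer = set(layer)
        remaining = [p for p in remaining if p not in in_layer]
    return layers


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4), (5, 5)]
    for i, layer in enumerate(skyline_layers(data), start=1):
        print(f"layer {i}: {layer}")
```

In this sketch every point is tested against all remaining points, which is exactly the cost that a shared, efficiently updatable layer structure is intended to reduce.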
