A fast and progressive algorithm for skyline queries with totally- and partially-ordered domains

We devise a skyline algorithm that efficiently mitigates the enormous overhead of processing millions of tuples on totally- and partially-ordered domains (henceforth, TODs and PODs). With massive datasets, existing techniques spend a significant amount of time on dominance comparisons for two reasons: the large number of skyline points, and the non-progressive way in which skylines are computed over PODs. High dimensionality aggravates both problems. Progressiveness turns out to be the key property for addressing them. This article presents FAST-SKY, an algorithm that overcomes these two obstacles and improves skyline query processing time strikingly, even with high-dimensional data. Progressive skyline evaluation with PODs is guaranteed by new index structures and a topological sort order. A stratification technique is adopted to index data on PODs, and we propose two new index structures: stratified R-trees (SR-trees) for low-dimensional data and stratified MinMax treaps (SM-treaps) for high-dimensional data. Fast dominance comparison is achieved by using a reporting query instead of a dominance query, together with a dimensionality reduction technique. Experimental results suggest that in general cases (anti-correlated and uniform distributions) FAST-SKY is orders of magnitude faster than existing algorithms.
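To make the notion of dominance over mixed TOD and POD attributes concrete, the following is a minimal Python sketch. It is not FAST-SKY and uses none of its index structures; it is a naive quadratic baseline that only illustrates the dominance relation. The function names, the tuple layout (numeric TOD attributes first, where smaller is better, followed by a single POD attribute), and the example preference order over the labels A to D are all assumptions made for illustration.

```python
def transitive_closure(edges, values):
    """Map each value to the set of values strictly worse than it.

    `edges` holds (better, worse) pairs of a partial order (assumed acyclic).
    """
    worse_than = {v: set() for v in values}
    for better, worse in edges:
        worse_than[better].add(worse)
    # Warshall-style closure: if a reaches k, a also reaches all k reaches.
    for k in values:
        for a in values:
            if k in worse_than[a]:
                worse_than[a] |= worse_than[k]
    return worse_than


def dominates(p, q, worse_than):
    """True if p is at least as good as q everywhere and strictly better somewhere."""
    *p_num, p_cat = p
    *q_num, q_cat = q
    # TOD attributes: p must not be worse on any numeric attribute.
    if any(a > b for a, b in zip(p_num, q_num)):
        return False
    # POD attribute: p must be equal to or better than q; incomparable fails.
    pod_better = q_cat in worse_than[p_cat]
    if not (pod_better or p_cat == q_cat):
        return False
    return pod_better or any(a < b for a, b in zip(p_num, q_num))


def skyline(points, worse_than):
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p, worse_than) for q in points)]


# Hypothetical preference order: A beats B and C, C beats D; B and C are incomparable.
order = transitive_closure([("A", "B"), ("A", "C"), ("C", "D")], "ABCD")

# Hypothetical tuples of the form (price, stops, airline).
points = [
    (100, 1, "B"),
    (120, 1, "A"),
    (100, 1, "C"),
    (100, 1, "D"),
    (130, 1, "B"),
]
result = skyline(points, order)
```

In this example `(100, 1, "B")` and `(100, 1, "C")` both survive because B and C are incomparable, which is why PODs tend to enlarge the skyline and increase the number of dominance comparisons a naive method has to perform.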
