Skyline Processing on Distributed Vertical Decompositions

We assume a data set that is vertically decomposed among several servers, and a client that wishes to compute the skyline while retrieving the minimum number of points. Existing solutions for this problem are restricted to the case where each server maintains exactly one dimension. This paper proposes a general solution for vertical decompositions of arbitrary dimensionality. We first investigate some problem characteristics regarding the pruning power of points. Then, we introduce vertical partition skyline (VPS), an algorithmic framework that consists of two phases. Phase 1 searches for an anchor point Panc that dominates, and hence eliminates, a large number of records. Starting with Panc, Phase 2 incrementally constructs a pruning area using a union-intersection property of dominance regions. Servers do not transmit points that fall within the pruning area in their local subspace. Our experiments confirm the effectiveness of the proposed methods under various settings.
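To make the notions of dominance and pruning power concrete, the following minimal sketch shows a dominance test, a naive centralized skyline, and the set of points that a single candidate anchor eliminates. It is not the VPS algorithm: it assumes that smaller values are preferable in every dimension and that all dimensions are available in one place, and the function names (`dominates`, `skyline`, `eliminated_by`) are hypothetical names chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def eliminated_by(anchor: Point, points: List[Point]) -> List[Point]:
    """Points dominated by a candidate anchor, i.e. its pruning power."""
    return [p for p in points if dominates(anchor, p)]


points = [(1, 9), (2, 4), (3, 3), (5, 5), (6, 1), (7, 7)]
print(skyline(points))                # [(1, 9), (2, 4), (3, 3), (6, 1)]
print(eliminated_by((3, 3), points))  # [(5, 5), (7, 7)]
```

In this toy data set the point (3, 3) would be a reasonable anchor, since it alone eliminates both non-skyline records. In the distributed setting described above, no single server holds all dimensions, so the test cannot be evaluated this directly.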
