Skyframe: a framework for skyline query processing in peer-to-peer systems

This paper looks at the processing of skyline queries on peer-to-peer (P2P) networks. We propose Skyframe, a framework for efficient skyline query processing in P2P systems, which addresses the challenges of quick response time, low network communication cost and query load balancing among peers. Skyframe consists of two querying methods: one is optimized for network communication while the other focuses on query response time. These methods are different in the way in which the query search space is defined. In particular, the first method uses a high dominating point that has a large dominating region to prune the search space to achieve a low cost in network communication. On the other hand, the second method relaxes the search space in order to allow parallel query processing to speed up query response. Skyframe achieves query load balancing by both query load conscious data space splitting/merging during the join/departure of nodes and dynamic load migration. We further show how to apply Skyframe to both the P2P systems supporting multi-dimensional indexing and the P2P systems supporting single-dimensional indexing. Finally, we have conducted extensive experiments on both real and synthetic data sets over two existing P2P systems: CAN (Ratnasamy in A scalable content-addressable network. In: Proceedings of SIGCOMM Conference, pp. 161–172, 2001) and BATON (Jagadish et al. in A balanced tree structure for peer-to-peer networks. In: Proceedings of VLDB Conference, pp. 661–672, 2005) to evaluate the effectiveness and scalability of Skyframe.

[1]  Mark Handley,et al.  A scalable content-addressable network , 2001, SIGCOMM '01.

[2]  Beng Chin Ooi,et al.  Efficient Progressive Skyline Computation , 2001, VLDB.

[3]  Donald Kossmann,et al.  The Skyline operator , 2001, Proceedings 17th International Conference on Data Engineering.

[4]  David R. Karger,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM '01.

[5]  Donald Kossmann,et al.  Shooting Stars in the Sky: An Online Algorithm for Skyline Queries , 2002, VLDB.

[6]  Artur Andrzejak,et al.  Scalable, efficient range queries for grid information services , 2002, Proceedings. Second International Conference on Peer-to-Peer Computing,.

[7]  Bernhard Seeger,et al.  An optimal and progressive algorithm for skyline queries , 2003, SIGMOD '03.

[8]  Jan Chomicki,et al.  Skyline with presorting , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[9]  Pedro A. Szekely,et al.  MAAN: A Multi-Attribute Addressable Network for Grid Information Services , 2003, Proceedings. First Latin American Web Congress.

[10]  Randolph Y. Wang,et al.  SkipIndex : Towards a Scalable Peer-to-Peer Index Service for High Dimensional Data , 2004 .

[11]  Wolf-Tilo Balke,et al.  Efficient Distributed Skylining for Web Information Systems , 2004, EDBT.

[12]  Srinivasan Seshan,et al.  Mercury: supporting scalable multi-attribute range queries , 2004, SIGCOMM '04.

[13]  Aoying Zhou,et al.  Adapting the Content Native Space for Load Balanced Indexing , 2004, DBISP2P.

[14]  Hector Garcia-Molina,et al.  One torus to rule them all: multi-dimensional queries in P2P systems , 2004, WebDB '04.

[15]  Hector Garcia-Molina,et al.  Online Balancing of Range-Partitioned Data with Applications to Peer-to-Peer Systems , 2004, VLDB.

[16]  Hongjun Lu,et al.  Stabbing the sky: efficient skyline computation over sliding windows , 2005, 21st International Conference on Data Engineering (ICDE'05).

[17]  Jian Pei,et al.  Catching the Best Views of Skyline: A Semantic Approach Based on Decisive Subspaces , 2005, VLDB.

[18]  Kian-Lee Tan,et al.  Stratified computation of skylines with partially-ordered domains , 2005, SIGMOD '05.

[19]  Sriram Ramabhadran,et al.  A case study in building layered DHT applications , 2005, SIGCOMM '05.

[20]  Jarek Gryz,et al.  Maximal Vector Computation in Large Data Sets , 2005, VLDB.

[21]  Qing Liu,et al.  Efficient Computation of the Skyline Cube , 2005, VLDB.

[22]  Beng Chin Ooi,et al.  BATON: A Balanced Tree Structure for Peer-to-Peer Networks , 2005, VLDB.

[23]  Karl Aberer,et al.  Multifaceted Simultaneous Load Balancing in DHT-Based P2P Systems: A New Game with Old Balls and Bins , 2005, Self-star Properties in Complex Information Systems.

[24]  Anthony K. H. Tung,et al.  On High Dimensional Skylines , 2006, EDBT.

[25]  Beng Chin Ooi,et al.  Skyline Queries Against Mobile Lightweight Devices in MANETs , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[26]  Tian Xia,et al.  Refreshing the sky: the compressed skycube with efficient support for frequent updates , 2006, SIGMOD Conference.

[27]  Katja Hose,et al.  Processing relaxed skylines in PDMS using distributed data summaries , 2006, CIKM '06.

[28]  Jian Pei,et al.  SUBSKY: Efficient Computation of Skylines in Subspaces , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[29]  Bernhard Seeger,et al.  Constrained subspace skyline computation , 2006, CIKM '06.

[30]  Ben Y. Zhao,et al.  Parallelizing Skyline Queries for Scalable Distribution , 2006, EDBT.

[31]  Anthony K. H. Tung,et al.  Finding k-dominant skylines in high dimensional space , 2006, SIGMOD Conference.

[32]  Cyrus Shahabi,et al.  The spatial skyline queries , 2006, VLDB.

[33]  Anthony K. H. Tung,et al.  On Dominating Your Neighborhood Profitably , 2007, VLDB.

[34]  Bin Jiang,et al.  Probabilistic Skylines on Uncertain Data , 2007, VLDB.

[35]  Xuemin Lin,et al.  Selecting Stars: The k Most Representative Skyline Operator , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[36]  Anthony K. H. Tung,et al.  Efficient Skyline Query Processing on Peer-to-Peer Networks , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[37]  Heng Tao Shen,et al.  Multi-source Skyline Query Processing in Road Networks , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[38]  Jian Pei,et al.  Computing Compressed Multidimensional Skyline Cubes Efficiently , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[39]  Christos Doulkeridis,et al.  SKYPEER: Efficient Subspace Skyline Computation over Distributed Data , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[40]  Ömer Egecioglu,et al.  DeltaSky: Optimal Maintenance of Skyline Deletions without Exclusive Dominance Region Generation , 2007, 2007 IEEE 23rd International Conference on Data Engineering.