Communication-Efficient Distributed Skyline Computation

In this paper we study skyline queries in the distributed computational model, where there are s remote sites and a central coordinator; each site holds a portion of the data, and the coordinator wants to compute the skyline of the union of the s datasets. The computation proceeds in rounds, and the goal is to minimize both the total communication cost and the number of rounds. We first give an algorithm with a small communication cost but a potentially large number of rounds; we show information-theoretically that its communication cost is optimal even if an unbounded number of communication rounds is allowed. We next give algorithms with smooth tradeoffs between communication and rounds. We also show a strong lower bound on the communication cost when only one round of communication is allowed. Finally, we demonstrate the superiority of our algorithms over existing ones through an extensive set of experiments on both synthetic and real-world datasets.
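
To make the coordinator model concrete, the following is a minimal sketch of a naive one-round baseline, not one of the algorithms proposed in the paper: every site sends its local skyline to the coordinator, which then computes the skyline of what it received. The function names, the convention that smaller values are better in every dimension, and the use of "number of points sent" as the communication cost are all assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every dimension and strictly better in one.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points: Iterable[Point]) -> List[Point]:
    """Quadratic-time skyline: keep the points that no other point dominates."""
    pts = list(set(points))  # drop exact duplicates
    return sorted(p for p in pts if not any(dominates(q, p) for q in pts))


def naive_one_round(sites: Sequence[Sequence[Point]]) -> Tuple[List[Point], int]:
    """Hypothetical baseline: each site ships its local skyline in one round.

    Returns the global skyline and the number of points communicated.
    """
    messages = [skyline(site) for site in sites]      # computed locally at each site
    communication = sum(len(m) for m in messages)     # points sent to the coordinator
    merged = skyline(p for m in messages for p in m)  # computed at the coordinator
    return merged, communication


if __name__ == "__main__":
    sites = [
        [(1, 5), (4, 4), (3, 3)],
        [(2, 2), (5, 1), (6, 6)],
    ]
    result, cost = naive_one_round(sites)
    print(result, cost)
```

This baseline is correct because any point on the global skyline must also be on the skyline of the site that holds it. Its cost is the sum of the local skyline sizes, which can be much larger than the global skyline; that gap is the kind of overhead that communication-efficient protocols aim to reduce.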
