SkyDB: Skyline Aware Query Evaluation Framework

In recent years, much attention has been devoted to skyline evaluation; however, existing techniques primarily target skyline algorithms over a single set. These techniques face two serious limitations: (1) they define skylines over a single set only, and (2) they treat the skyline as an “add-on” that is loosely integrated on top of the query plan. In this work, we investigate the evaluation of skylines over disparate sources via joins. We propose SkyDB, a skyline-aware query evaluation framework that addresses four key issues to enable the treatment of skylines as first-class citizens in query processing. First, we extend the relational model to include skyline-aware operators. Second, for these new operators, we design execution strategies that are tuned to exploit the skyline knowledge. Third, we propose a skyline-aware query optimizer that effectively chooses among the query plan execution strategies. Fourth, we observe that in the literature the evaluation of skylines over joins is considered to be blocking; existing approaches therefore focus only on reducing the skyline evaluation time, which renders them inapplicable to response-time-sensitive applications. We thus aim to make the execution of skylines over joins non-blocking so that SkyDB can produce results progressively. Our preliminary performance study demonstrates the superiority of our proposed methodologies over existing techniques, outperforming them in many cases by several orders of magnitude.
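To make the setting concrete, the following is a minimal sketch of the blocking "join first, skyline afterwards" baseline that the abstract argues against. It is not SkyDB's method. The function names, the equi-join on a single key, the tuple layout, and the convention that smaller values are preferred in every dimension are all assumptions made for illustration.

```python
from typing import Hashable, List, Sequence, Tuple

Point = Tuple[float, ...]
Row = Tuple[Hashable, Point]  # (join key, skyline attributes); assumed layout


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Block-nested-loop style skyline over a single set of points."""
    window: List[Point] = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated, so it cannot be in the skyline
        # p survives: evict every window point that p dominates
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


def skyline_over_join(left: Sequence[Row], right: Sequence[Row]) -> List[Point]:
    """Naive baseline: materialize the whole equi-join, then compute the
    skyline. No result can be emitted until the join has finished."""
    joined = [la + ra for lk, la in left for rk, ra in right if lk == rk]
    return skyline(joined)


# Hypothetical data: flights as (city, (price, stops)), hotels as (city, (rate,))
flights = [("BOS", (300.0, 2.0)), ("BOS", (350.0, 3.0)), ("NYC", (400.0, 1.0))]
hotels = [("BOS", (120.0,)), ("NYC", (90.0,))]
result = skyline_over_join(flights, hotels)
```

In this sketch the combination (350.0, 3.0, 120.0) is discarded because (300.0, 2.0, 120.0) dominates it, but that only becomes known after every join result has been produced. This is the blocking behavior that a progressive, skyline-aware plan would aim to avoid.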
