SCSA: Evaluating skyline queries in incomplete data

Skyline queries are widely used in contemporary database applications, including multi-criteria decision-making systems, decision support systems, and recommendation systems. Because of their benefits and wide range of applications, many skyline algorithms have been proposed for a variety of data settings. Nonetheless, most researchers assume that the data are complete, meaning that every value of every data item is available. Since this assumption does not hold in many real-world database applications, the existing algorithms cannot be applied directly to a database with incomplete data. In such cases, processing skyline queries incurs exhaustive pairwise comparisons between data items, and the dominance relation may lose its transitivity property. Losing the transitivity property may in turn give rise to the problem of cyclic dominance. To address these issues, we propose a new skyline algorithm called Sorting-based Cluster Skyline Algorithm (SCSA), which combines sorting and partitioning techniques to simplify skyline computation on an incomplete dataset. These two techniques speed up the skyline process and avoid many unnecessary pairwise comparisons when pruning dominated data items. Comprehensive experiments on both synthetic and real-life datasets demonstrate the effectiveness and versatility of our approach compared with existing approaches.
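
To make the loss of transitivity and the resulting cyclic dominance concrete, here is a minimal sketch. It is not the SCSA algorithm. It assumes a dominance definition commonly used for incomplete data: two items are compared only on the dimensions where both have a value, and smaller values are preferred. The names `dominates` and `naive_skyline` and the three sample items are hypothetical and chosen for illustration only.

```python
from typing import List, Optional, Sequence

# A data item is a sequence of values; None marks a missing value.
Point = Sequence[Optional[float]]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q on their shared non-missing dimensions.

    Assumption: smaller is better, and items with no shared dimension
    are incomparable.
    """
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    if not common:
        return False
    no_worse = all(a <= b for a, b in common)
    strictly_better = any(a < b for a, b in common)
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Exhaustive pairwise skyline: keep items that no other item dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Three incomplete items, each missing a different dimension.
a = (1, None, 3)
b = (None, 2, 1)
c = (2, 1, None)

print(dominates(a, c))  # a beats c on dimension 1
print(dominates(c, b))  # c beats b on dimension 2
print(dominates(a, b))  # yet a does not beat b: transitivity is lost
print(dominates(b, a))  # b beats a on dimension 3, closing the cycle
print(naive_skyline([a, b, c]))
```

Under this assumed definition, a dominates c, c dominates b, and b dominates a, so every item is dominated by another and the exhaustive pairwise skyline is empty. This is the situation that algorithms for incomplete data have to handle explicitly.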
