A Framework for Evaluating Skyline Queries over Incomplete Data

Research interest in skyline queries has increased significantly over the years, as skyline queries can be used in many contemporary applications, such as multi-criteria decision-making systems, decision support systems, recommendation systems, data mining, and personalized systems. A skyline query returns the data items that are not dominated by any other data item over the dimensions (attributes) considered. Most existing skyline approaches assume that the database is complete and that all values are present during skyline processing. However, this assumption does not always hold, particularly in real-world databases, where the values of a data item may be unavailable (missing) in one or more dimensions. Incompleteness harms skyline processing because the dominance relation loses its transitivity property, which leads to cyclic dominance. Therefore, applying a skyline technique directly to an incomplete database is prohibitive and may require exhaustive pairwise comparisons. This paper presents an approach that efficiently evaluates skyline queries over incomplete databases. The approach aims to reduce the number of pairwise comparisons and to shrink the search space when identifying the skylines. Several experiments demonstrate that our approach outperforms the previous approach by requiring fewer pairwise comparisons. Furthermore, the results illustrate that our approach is scalable and efficient.
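To make the cyclic dominance problem concrete, the following minimal Python sketch may help. It is an illustration only and not the approach proposed in this paper. It assumes that smaller values are preferred, that a missing value is represented by `None`, and that two items are compared only on the dimensions where both have a value. The names `dominates` and `naive_skyline` and the sample items are hypothetical.

```python
from typing import Optional, Sequence

Item = Sequence[Optional[float]]  # None marks a missing value (assumption)


def dominates(a: Item, b: Item) -> bool:
    """True if a dominates b on the dimensions both items have values for.

    Assumes smaller is better. Items with no dimension in common are
    treated as incomparable, so neither dominates the other.
    """
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not common:
        return False
    no_worse = all(x <= y for x, y in common)
    strictly_better = any(x < y for x, y in common)
    return no_worse and strictly_better


def naive_skyline(items: Sequence[Item]) -> list:
    """Exhaustive pairwise check: keep items that no other item dominates."""
    return [p for p in items
            if not any(dominates(q, p) for q in items if q is not p)]


# Three incomplete items, each missing one dimension.
a = (2, None, 1)
b = (None, 1, 2)
c = (1, 2, None)

# a beats b on dimension 3, b beats c on dimension 2, c beats a on dimension 1,
# so dominance forms a cycle and is no longer transitive.
cycle = dominates(a, b) and dominates(b, c) and dominates(c, a)

# With complete data the same routine behaves as expected.
complete = [(1, 2), (2, 1), (2, 2)]
complete_skyline = naive_skyline(complete)
incomplete_skyline = naive_skyline([a, b, c])
```

Under these assumptions every item in the incomplete example is dominated by another one, so the naive definition yields an empty skyline, and no item can be safely discarded early because transitivity cannot be relied on to prune comparisons. This is why a direct application of a complete-data skyline technique tends toward exhaustive pairwise comparison.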
