Skyline query processing for incomplete data in cloud environment

Many research works have focused on processing skyline queries over databases. Recently, several approaches have been proposed to address skyline queries over a partially complete database, in which some data item values might be missing. However, these approaches are tailored to centralized databases and access only one table to identify the skylines. In many contemporary database applications this might not be the case, particularly for a database with incomplete data and many tables spread over various remote locations, such as a cloud environment. Applying skyline approaches designed for centralized databases directly to cloud databases is undesirable because of the prohibitive cost of transferring data from one datacenter to another during skyline processing. An approach is needed that takes into consideration the unique features of the cloud environment when processing skyline queries on a database with incomplete data. This paper proposes an approach that evaluates skyline queries in a database with partially incomplete data over the cloud. The approach aims to reduce both the number of pairwise comparisons that need to be conducted between data items and the amount of data transferred in identifying skylines. Several experiments over synthetic and real datasets have been conducted to evaluate the performance of our approach. The results show that our approach outperforms the previous approach in terms of the number of pairwise comparisons and the amount of data transferred.
