Processing Skyline Groups on Data Streams

The skyline of a multidimensional dataset is the set of objects that are not dominated by any other object in the dataset. An object p dominates an object p' if and only if p is not worse than p' on every attribute (dimension) and is better than p' on at least one attribute. Given the same kind of dataset, the skyline group query returns the groups that are not dominated by any other group, where every group contains the same number of objects. Although the skyline group query has been investigated in recent years, most existing techniques are designed for static datasets. In many practical applications, however, data change over time, and query processing techniques for such dynamic datasets are not available. Finding skyline groups on data streams is therefore highly desirable. In this paper, we propose a new algorithm to find the skyline groups on data streams. We use two data structures, CL and GSM, to store intermediate results: CL stores the candidate objects that may become members of skyline groups, and GSM stores the intermediate results of the dynamic programming. We conduct experiments on synthetic datasets. The experimental results show that the proposed algorithm finds skyline groups efficiently.
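The dominance relation defined above can be made concrete with a minimal sketch. It covers only the single-object skyline on a static dataset, not the proposed streaming algorithm or the CL and GSM structures. The sketch assumes that smaller values are better on every attribute; the names `dominates`, `skyline`, and the sample `hotels` data are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q (assumption: smaller is better)."""
    # p must be no worse than q on every attribute ...
    not_worse = all(a <= b for a, b in zip(p, q))
    # ... and strictly better on at least one attribute.
    strictly_better = any(a < b for a, b in zip(p, q))
    return not_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep the points no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to the beach), both to be minimized.
hotels = [(50, 8), (60, 3), (80, 2), (70, 9), (55, 8)]
result = skyline(hotels)
print(result)  # [(50, 8), (60, 3), (80, 2)]
```

Here (70, 9) and (55, 8) are both dominated by (50, 8), so only three objects remain. The pairwise comparison is quadratic in the number of objects, which is one reason a streaming setting calls for maintaining candidates incrementally instead of recomputing from scratch.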
