Loyalty-based selection: retrieving objects that persistently satisfy criteria

A traditional query returns the set of objects that satisfy user-defined criteria at the time the query was issued. The results are based on the values of the objects at query time and may therefore be affected by outliers. Intuitively, an object better meets the user's needs if it persistently satisfies the criteria, i.e., it satisfies the criteria for the majority of the past T time units. In this paper, we propose a measure named loyalty that reflects how persistently an object satisfies the criteria. Formally, the loyalty of an object is the total time (within the past T time units) during which it satisfies the query criteria. We study top-k loyalty queries over sliding windows, which continuously report the k objects with the highest loyalties. Each object issues an update when it starts satisfying the criteria and when it stops satisfying them. We show that the lower bound cost of updating the results of a top-k loyalty query is O(log N) per object update, where N is the number of updates issued in the last T time units. We conduct a detailed complexity analysis and show that our proposed algorithm is optimal. Moreover, we propose effective pruning techniques to improve efficiency. We experimentally verify the effectiveness of the proposed approach by comparing it with a classic sweep line algorithm.
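To make the loyalty definition concrete, the following is a minimal sketch that recomputes loyalties from scratch and then picks the k largest. It is a naive baseline for illustration only, not the optimal algorithm described in the abstract. The names `loyalty`, `top_k_loyal`, and `history` are hypothetical, and the sketch assumes that each object's start/stop updates have already been paired into disjoint `(begin, end)` intervals, with `end = None` meaning the object still satisfies the criteria.

```python
from heapq import nlargest


def loyalty(intervals, now, T):
    """Total time within the window [now - T, now] covered by the intervals.

    Assumption: intervals are disjoint (begin, end) pairs built from an
    object's start/stop updates; end is None if the object still qualifies.
    """
    window_start = now - T
    total = 0.0
    for begin, end in intervals:
        if end is None:
            end = now  # still satisfying the criteria at query time
        lo = max(begin, window_start)  # clip the interval to the window
        hi = min(end, now)
        if hi > lo:
            total += hi - lo
    return total


def top_k_loyal(history, now, T, k):
    """Return the k (object, loyalty) pairs with the highest loyalty."""
    scores = {obj: loyalty(ivs, now, T) for obj, ivs in history.items()}
    return nlargest(k, scores.items(), key=lambda pair: pair[1])


# Hypothetical update history for three objects.
history = {
    "a": [(0, 4), (6, None)],  # qualified during [0, 4], then again from 6 on
    "b": [(2, 9)],
    "c": [(8, None)],
}
result = top_k_loyal(history, now=10, T=8, k=2)
print(result)  # [('b', 7.0), ('a', 6.0)]
```

With a window of T = 8 ending at time 10, object "a" is currently satisfying the criteria but has a lower loyalty (6) than "b" (7), which stopped satisfying them at time 9. This is the difference between a snapshot query and a loyalty query. Recomputing every score on each update costs time linear in the number of intervals, which is what an incremental method would aim to avoid.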
