Efficient Processing of Top-k Dominating Queries on Multi-Dimensional Data

The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. In this paper, we design specialized algorithms that apply on indexed multi-dimensional data and fully exploit the characteristics of the problem. Experiments on synthetic datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach, while our results on real datasets show the meaningfulness of top-k dominating queries.

[1]  Timos K. Sellis,et al.  A model for the prediction of R-tree performance , 1996, PODS.

[2]  Qing Liu,et al.  Efficient Computation of the Skyline Cube , 2005, VLDB.

[3]  Panos Kalnis,et al.  Efficient OLAP Operations in Spatial Data Warehouses , 2001, SSTD.

[4]  Kian-Lee Tan,et al.  Stratified computation of skylines with partially-ordered domains , 2005, SIGMOD '05.

[5]  Jian Pei,et al.  Catching the Best Views of Skyline: A Semantic Approach Based on Decisive Subspaces , 2005, VLDB.

[6]  Jan Chomicki,et al.  Skyline with presorting , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[7]  Jian Pei,et al.  SUBSKY: Efficient Computation of Skylines in Subspaces , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[8]  Mario A. López,et al.  STR: a simple and efficient algorithm for R-tree packing , 1997, Proceedings 13th International Conference on Data Engineering.

[9]  Kevin Chen-Chuan Chang,et al.  Supporting ad-hoc ranking aggregates , 2006, SIGMOD Conference.

[10]  Beng Chin Ooi,et al.  Efficient Progressive Skyline Computation , 2001, VLDB.

[11]  Parke Godfrey,et al.  Skyline Cardinality for Relational Processing , 2004, FoIKS.

[12]  Donald Kossmann,et al.  The Skyline operator , 2001, Proceedings 17th International Conference on Data Engineering.

[13]  Jarek Gryz,et al.  Maximal Vector Computation in Large Data Sets , 2005, VLDB.

[14]  Hanan Samet,et al.  Distance browsing in spatial databases , 1999, TODS.

[15]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[16]  Arthur R. Butz,et al.  Alternative Algorithm for Hilbert's Space-Filling Curve , 1971, IEEE Transactions on Computers.

[17]  Anthony K. H. Tung,et al.  Finding k-dominant skylines in high dimensional space , 2006, SIGMOD Conference.

[18]  Donald Kossmann,et al.  Shooting Stars in the Sky: An Online Algorithm for Skyline Queries , 2002, VLDB.

[19]  Wolf-Tilo Balke,et al.  Efficient Distributed Skylining for Web Information Systems , 2004, EDBT.

[20]  Bernhard Seeger,et al.  Progressive skyline computation in database systems , 2005, TODS.

[21]  Anthony K. H. Tung,et al.  DADA: a data cube for dominant relationship analysis , 2006, SIGMOD Conference.

[22]  Moni Naor,et al.  Optimal aggregation algorithms for middleware , 2001, PODS '01.

[23]  Beng Chin Ooi,et al.  Skyline Queries Against Mobile Lightweight Devices in MANETs , 2006, 22nd International Conference on Data Engineering (ICDE'06).

[24]  Anthony K. H. Tung,et al.  On High Dimensional Skylines , 2006, EDBT.

[25]  Sharad Mehrotra,et al.  Progressive approximate aggregate queries with a multi-resolution tree structure , 2001, SIGMOD '01.

[26]  Hongjun Lu,et al.  Stabbing the sky: efficient skyline computation over sliding windows , 2005, 21st International Conference on Data Engineering (ICDE'05).

[27]  Vagelis Hristidis,et al.  PREFER: a system for the efficient execution of multi-parametric ranked queries , 2001, SIGMOD '01.

[28]  Surajit Chaudhuri,et al.  Robust Cardinality and Cost Estimation for Skyline Operator , 2006, 22nd International Conference on Data Engineering (ICDE'06).