On Dominating Your Neighborhood Profitably

Recent research on skyline queries has attracted much interest in the database and data mining community. Given a database, an object belongs to the skyline if it cannot be dominated with respect to the given attributes by any other database object. Current methods have only considered so-called min/max attributes like price and quality which a user wants to minimize or maximize. However, objects can also have spatial attributes like x, y coordinates which can be used to represent relevant constraints on the query results. In this paper, we introduce novel skyline query types taking into account not only min/max attributes but also spatial attributes and the relationships between these different attribute types. Such queries support a micro-economic approach to decision making, considering not only the quality but also the cost of solutions. We investigate two alternative approaches for efficient query processing, a symmetrical one based on off-the-shelf index structures, and an asymmetrical one based on index structures with special purpose extensions. Our experimental evaluation using a real dataset and various synthetic datasets demonstrates that the new query types are indeed meaningful and the proposed algorithms are efficient and scalable.

[1]  Nick Roussopoulos,et al.  Nearest neighbor queries , 1995, SIGMOD '95.

[2]  Jirí Matousek,et al.  Computing Dominances in E^n , 1991, Inf. Process. Lett..

[3]  Donald Kossmann,et al.  Shooting Stars in the Sky: An Online Algorithm for Skyline Queries , 2002, VLDB.

[4]  Clifford Stein,et al.  Introduction to Algorithms, 2nd edition. , 2001 .

[5]  Michael Ian Shamos,et al.  Computational geometry: an introduction , 1985 .

[6]  Bernhard Seeger,et al.  Progressive skyline computation in database systems , 2005, TODS.

[7]  Anthony K. H. Tung,et al.  DADA: a data cube for dominant relationship analysis , 2006, SIGMOD Conference.

[8]  Ronald L. Rivest,et al.  Introduction to Algorithms, Second Edition , 2001 .

[9]  Anthony K. H. Tung,et al.  Mining top-n local outliers in large databases , 2001, KDD '01.

[10]  Jon M. Kleinberg,et al.  Segmentation problems , 2004, JACM.

[11]  Ivan Stojmenovic,et al.  An optimal parallel algorithm for solving the maximal elements problem in the plane , 1988, Parallel Comput..

[12]  Wolf-Tilo Balke,et al.  Efficient Distributed Skylining for Web Information Systems , 2004, EDBT.

[13]  Donald Kossmann,et al.  The Skyline operator , 2001, Proceedings 17th International Conference on Data Engineering.

[14]  Frank Nielsen,et al.  Output-Sensitive Peeling of Convex and Maximal Layers , 1996, Inf. Process. Lett..

[15]  Antonin Guttman,et al.  R-trees: a dynamic index structure for spatial searching , 1984, SIGMOD '84.

[16]  Tian Zhang,et al.  BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.

[17]  Jon M. Kleinberg,et al.  A Microeconomic View of Data Mining , 1998, Data Mining and Knowledge Discovery.

[18]  Bernhard Seeger,et al.  An optimal and progressive algorithm for skyline queries , 2003, SIGMOD '03.

[19]  Beng Chin Ooi,et al.  Efficient Progressive Skyline Computation , 2001, VLDB.

[20]  Hongjun Lu,et al.  Stabbing the sky: efficient skyline computation over sliding windows , 2005, 21st International Conference on Data Engineering (ICDE'05).