Top-k Skyline: A Unified Approach

The WWW has become a huge repository of information. For almost any knowledge domain, there may exist thousands of available sources and billions of data instances, and many of these sources may publish irrelevant data. User-preference approaches have been defined to retrieve relevant data based on similarity, relevance, or preference criteria specified by the user. Although many declarative languages can express user preferences, taking this information into account during query optimization and evaluation remains an open problem. SQLf, Top-k, and Skyline are three extensions of SQL for specifying user preferences. The first two filter out irrelevant answers following a score-based paradigm, whereas Skyline produces relevant non-dominated answers using an order-based paradigm. The main objective of our work is to propose a unified approach that combines the order-based and score-based paradigms. We propose physical operators for SQLf that incorporate Skyline and Top-k features; the properties of these operators will be considered during query optimization and evaluation. We describe a Hybrid-Naive operator that produces only those answers on the Pareto curve that have the best score values. We have conducted initial experimental studies to compare the Hybrid operator with Skyline and SQLf.
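To make the combination of the two paradigms concrete, the following is a minimal Python sketch, not the Hybrid-Naive operator itself. It first keeps only the non-dominated tuples (the order-based step) and then ranks those survivors by a score and returns the best k (the score-based step). The function names, the assumption that larger attribute values are preferred, the averaging score function, and the sample data are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Assumption: larger values are preferred in every dimension.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def top_k_skyline(points: List[Point], score: Callable[[Point], float], k: int) -> List[Point]:
    """Rank the skyline points by score (highest first) and keep the best k."""
    return sorted(skyline(points), key=score, reverse=True)[:k]


def mean_score(p: Point) -> float:
    """Hypothetical score function: the plain average of the attribute values."""
    return sum(p) / len(p)


# Hypothetical data: (quality, affordability) ratings on a 0-10 scale.
hotels = [(9, 2), (6, 6), (2, 8), (5, 5), (1, 1)]
pareto = skyline(hotels)                     # (5, 5) and (1, 1) are dominated by (6, 6)
best = top_k_skyline(hotels, mean_score, 2)  # the two highest-scoring skyline points
print(pareto)
print(best)
```

In this sketch the skyline step alone returns three incomparable tuples, while the score step orders them and cuts the result to k, which is the kind of answer set a purely order-based or purely score-based query would not produce on its own.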
