A top-<i>k</i> query asks for <i>k</i> tuples ordered according to a specific ranking function that combines the values of multiple participating attributes. The combined score function is usually linear. To answer top-<i>k</i> queries efficiently, preprocessing and indexing of the data have been used to speed up run-time performance. Many indexing methods allow online query algorithms to retrieve the data progressively and stop at a certain point. However, in many cases, the number of data accesses is sensitive to the query parameters (i.e., the linear weights in the score functions). In this paper, we study the sequentially layered indexing problem, where tuples are put into multiple consecutive layers and any top-<i>k</i> query can be answered using at most <i>k</i> layers of tuples. We propose a new criterion for building the layered index. A layered index is robust if, for any <i>k</i>, the number of tuples in the top <i>k</i> layers is minimal in comparison with all the other alternatives. The robust index guarantees the worst-case performance for arbitrary query parameters. We derive a necessary and sufficient condition for a robust index. The problem is shown to be solvable within <i>O</i>(<i>n</i><sup><i>d</i></sup> log <i>n</i>) time (where <i>d</i> is the number of dimensions and <i>n</i> is the number of tuples). To reduce the high complexity of the exact solution, we develop an approximate approach, which has time complexity <i>O</i>(2<sup><i>d</i></sup> <i>n</i>(log <i>n</i>)<sup><i>r(d)</i>-1</sup>), where <i>r(d)</i> = ⌈<i>d</i>/2⌉ + ⌊<i>d</i>/2⌋ ⌈<i>d</i>/2⌉. Our experimental results show that our proposed method outperforms the best known previous methods.
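To make the layered-index idea concrete, the following is a minimal sketch and not the robust index proposed here. It assumes that larger attribute values are better, that all linear weights are non-negative, and it swaps in a simple dominance-based layering (repeatedly peeling off the non-dominated tuples). Under those assumptions a top-<i>k</i> answer can be read from the first <i>k</i> layers only. The names `dominates`, `build_layers`, `score`, and `top_k` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: at least as large in every attribute, larger in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def build_layers(points: Sequence[Point]) -> List[List[Point]]:
    """Assumed layering: peel off the non-dominated tuples, layer by layer.

    This is a simple dominance-based scheme used only for illustration;
    it is NOT the robust layering criterion described in the abstract.
    """
    remaining = list(points)
    layers: List[List[Point]] = []
    while remaining:
        layer = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        layers.append(layer)
        remaining = [p for p in remaining if p not in layer]
    return layers


def score(weights: Sequence[float], p: Point) -> float:
    """Linear score function: weighted sum of the attribute values."""
    return sum(w * x for w, x in zip(weights, p))


def top_k(layers: List[List[Point]], weights: Sequence[float], k: int) -> List[Point]:
    """Answer a top-k query by scanning only the first k layers.

    Assumes non-negative weights and that a higher score ranks first.
    """
    candidates = [p for layer in layers[:k] for p in layer]
    return sorted(candidates, key=lambda p: score(weights, p), reverse=True)[:k]


if __name__ == "__main__":
    data = [(1, 9), (9, 1), (5, 5), (4, 4), (2, 2), (8, 0.5), (0.5, 8), (3, 3)]
    index = build_layers(data)
    for i, layer in enumerate(index, start=1):
        print(f"layer {i}: {layer}")
    print(top_k(index, weights=(0.5, 0.5), k=2))
```

The fewer tuples the first <i>k</i> layers contain, the fewer tuples a query has to touch, which is the quantity the robustness criterion above seeks to minimize for every <i>k</i>.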