Paper information - Supporting ad-hoc ranking aggregates

Supporting ad-hoc ranking aggregates

This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support of such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively inefficient. Our framework is based on three fundamental principles. The Upper-Bound Principle dictates the requirements of early pruning, and the Group-Ranking and Tuple-Ranking Principles dictate group-ordering and tuple-ordering requirements. They together guide the query processor toward a provably optimal tuple schedule for aggregate query processing. We propose a new execution framework to apply the principles and requirements. We address the challenges in realizing the framework and implementing new query operators, enabling efficient group-aware and rank-aware query plans. The experimental study validates our framework by demonstrating orders of magnitude performance improvement in the new query plans, compared with the traditional plans.
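To make the contrast in the abstract concrete, the following minimal Python sketch compares a materialize-group-sort evaluation with an early-terminating one driven by per-group upper bounds. It is an illustration of the general pruning idea only, not the paper's operators or its optimal tuple schedule. Several things are assumptions introduced here: the aggregate is SUM, scores lie in `[0, max_score]`, the size of each group is known in advance (`group_sizes`), tuples are consumed in arrival order, and the function names `topk_naive` and `topk_with_upper_bounds` are hypothetical.

```python
import heapq
from collections import defaultdict


def topk_naive(tuples, k):
    """Materialize-group-sort: aggregate every group fully, then rank."""
    sums = defaultdict(float)
    for group, score in tuples:
        sums[group] += score
    return heapq.nlargest(k, sums.items(), key=lambda kv: kv[1])


def topk_with_upper_bounds(tuples, k, group_sizes, max_score):
    """Stop scanning once no group outside the current top k can catch up.

    Assumptions (not from the paper): SUM aggregate, scores in
    [0, max_score], and known group sizes.
    Returns (set of top-k group ids, number of tuples consumed).
    """
    seen_sum = {g: 0.0 for g in group_sizes}
    remaining = dict(group_sizes)
    consumed = 0
    for group, score in tuples:
        seen_sum[group] += score
        remaining[group] -= 1
        consumed += 1

        # Current leaders by partial sum; a partial sum is a valid lower
        # bound on the final SUM because scores are non-negative.
        top = set(heapq.nlargest(k, seen_sum, key=seen_sum.get))
        threshold = min(seen_sum[g] for g in top)

        # Upper bound of a group: every unseen tuple scores max_score.
        # If all outsiders are strictly below the threshold, the top-k
        # membership can no longer change and the scan can stop.
        if all(
            seen_sum[g] + remaining[g] * max_score < threshold
            for g in seen_sum
            if g not in top
        ):
            return top, consumed

    # Fell through: every tuple was needed to decide the answer.
    return set(heapq.nlargest(k, seen_sum, key=seen_sum.get)), consumed


data = [("a", 0.9), ("a", 0.8), ("b", 0.1), ("c", 0.2),
        ("a", 0.9), ("b", 0.2), ("c", 0.1)]
sizes = {"a": 3, "b": 2, "c": 2}

naive_result = topk_naive(data, 1)
pruned_groups, tuples_read = topk_with_upper_bounds(data, 1, sizes, 1.0)
```

On this toy input both functions agree that group `a` ranks first, but the bounded version decides after reading four of the seven tuples. Note that early termination settles only which groups are in the top k; their exact aggregates would still require reading the remaining tuples of those groups. The order in which tuples are consumed strongly affects how early the scan can stop, which is the concern the abstract attributes to the Group-Ranking and Tuple-Ranking Principles.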

Kevin Chen-Chuan Chang | Ihab F. Ilyas | Chengkai Li
