Algorithms for distributional and adversarial pipelined filter ordering problems

Pipelined filter ordering is a central problem in database query optimization. The problem is to determine the optimal order in which to apply a given set of commutative filters (predicates) to a set of elements (the tuples of a relation), so as to find, as efficiently as possible, the tuples that satisfy all of the filters. Optimization of pipelined filter ordering has recently received renewed attention in the context of environments such as the Web, continuous high-speed data streams, and sensor networks. Pipelined filter ordering problems are also studied in areas such as fault detection and machine learning under names such as learning with attribute costs, minimum-sum set cover, and satisficing search. We present algorithms for two natural extensions of the classical pipelined filter ordering problem: (1) a distributional-type problem where the filters run in parallel and the goal is to maximize throughput, and (2) an adversarial-type problem where the goal is to minimize the expected value of multiplicative regret. We present two related algorithms for solving (1), both running in time O(n^2), which improve on the O(n^3 log n) algorithm of Kodialam. We use techniques from our algorithms for (1) to obtain an algorithm for (2).
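To make the classical problem concrete, the following minimal Python sketch evaluates the expected per-tuple cost of a filter ordering and applies the well-known rank rule for independent filters (sort by cost divided by the probability of eliminating a tuple). It illustrates only the classical single-ordering setting, not the algorithms for extensions (1) and (2). Filter independence, the function names, and the sample costs and pass probabilities are all assumptions made for illustration.

```python
from itertools import permutations


def expected_cost(order, costs, pass_probs):
    """Expected cost of processing one tuple with filters applied in `order`.

    Assumes filters are independent: filter i costs costs[i] and lets a
    tuple through with probability pass_probs[i]. A tuple stops at the
    first filter it fails.
    """
    total = 0.0
    reach = 1.0  # probability the tuple survives to the current filter
    for i in order:
        total += reach * costs[i]
        reach *= pass_probs[i]
    return total


def rank_order(costs, pass_probs):
    """Order filters by increasing cost / (1 - pass probability)."""
    def rank(i):
        drop = 1.0 - pass_probs[i]
        # A filter that never eliminates a tuple goes last.
        return float("inf") if drop == 0.0 else costs[i] / drop
    return sorted(range(len(costs)), key=rank)


# Hypothetical instance with three filters.
costs = [1.0, 4.0, 2.0]
pass_probs = [0.9, 0.2, 0.5]

greedy = rank_order(costs, pass_probs)

# Brute-force check over all orderings (feasible only for small n).
best = min(
    permutations(range(len(costs))),
    key=lambda order: expected_cost(order, costs, pass_probs),
)

print(greedy, expected_cost(greedy, costs, pass_probs))
print(list(best), expected_cost(best, costs, pass_probs))
```

On this small instance the rank rule and the brute-force search agree on the same ordering. The brute-force check is included only as a sanity test, since it enumerates all n! orderings.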

[1] Jennifer Widom, et al. The Pipelined Set Cover Problem, 2005, ICDT.

[2] Andreas Weininger. Efficient execution of joins in a star schema, 2002, SIGMOD '02.

[3] Baruch Awerbuch, et al. A simple local-control approximation algorithm for multicommodity flow, 1993, Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science.

[4] Éva Tardos, et al. Fast Approximation Algorithms for Fractional Packing and Covering Problems, 1995, Math. Oper. Res.

[5] Murali S. Kodialam. The Throughput of Sequential Testing, 2001, IPCO.

[6] Joseph M. Hellerstein, et al. Eddies: continuously adaptive query processing, 2000, SIGMOD '00.

[7] U. Srivastava, et al. Ordering Pipelined Query Operators with Precedence Constraints, 2005.

[8] László Lovász, et al. Approximating Min Sum Set Cover, 2004, Algorithmica.

[9] Wei Hong, et al. Exploiting correlated attributes in acquisitional query processing, 2005, 21st International Conference on Data Engineering (ICDE'05).

[10] Mihir Bellare, et al. On Chromatic Sums and Distributed Resource Allocation, 1998, Inf. Comput.

[11] Herbert A. Simon, et al. Optimal Problem-Solving Search: All-or-None Solutions, 1975, Artif. Intell.

[12] Jennifer Widom, et al. Query optimization over web services, 2006, VLDB.

[13] László Lovász, et al. Approximating Min-sum Set Cover, 2002, APPROX.

[14] Mark A. Shayman, et al. Risk-sensitive decision-theoretic diagnosis, 2001, IEEE Trans. Autom. Control.

[15] Haim Kaplan, et al. Learning with attribute costs, 2005, STOC '05.

[16] Roy Goldman, et al. WSQ/DSQ: a practical approach for combined querying of databases and the Web, 2000, SIGMOD '00.

[17] Lisa Fleischer, et al. Fast and simple approximation schemes for generalized flow, 2002, Math. Program.

[18] Erol Gelenbe, et al. Analysis and Synthesis of Computer Systems, 1980.

[19] Oren Etzioni, et al. Efficient information gathering on the Internet, 1996, Proceedings of 37th Conference on Foundations of Computer Science.

[20] Edith Cohen, et al. Efficient sequences of trials, 2003, SODA '03.

[21] Edward G. Coffman, et al. A Characterization of Waiting Time Performance Realizable by Single-Server Queues, 1980, Oper. Res.

[22] Carlo Zaniolo, et al. Optimization of Nonrecursive Queries, 1986, VLDB.

[23] Jennifer Widom, et al. Adaptive ordering of pipelined stream filters, 2004, SIGMOD '04.

[25] Leonid Khachiyan, et al. Fast Approximation Schemes for Convex Programs with Many Blocks and Coupling Constraints, 1994, SIAM J. Optim.

[26] Surajit Chaudhuri, et al. Join queries with external text sources: execution and optimization techniques, 1995, SIGMOD '95.

[27] M. R. Garey, et al. Optimal task sequencing with precedence constraints, 1973, Discrete Mathematics.

[28] Goetz Graefe, et al. Multi-table joins through bitmapped join indices, 1995, SIGMOD Rec.

[29] Toshihide Ibaraki, et al. On the optimal nesting order for computing N-relational joins, 1984, TODS.

[30] Ning Wu, et al. Flow algorithms for two pipelined filter ordering problems, 2006, PODS '06.