Operator placement for in-network stream query processing

In sensor networks, data acquisition frequently takes place at low-capability devices. The acquired data is then transmitted through a hierarchy of nodes having progressively increasing network band-width and computational power. We consider the problem of executing queries over these data streams, posed at the root of the hierarchy. To minimize data transmission, it is desirable to perform "in-network" query processing: do some part of the work at intermediate nodes as the data travels to the root. Most previous work on in-network query processing has focused on aggregation and inexpensive filters. In this paper, we address in-network processing for queries involving possibly expensive conjunctive filters, and joins. We consider the problem of placing operators along the nodes of the hierarchy so that the overall cost of computation and data transmission is minimized. We show that the problem is tractable, give an optimal algorithm, and demonstrate that a simpler greedy operator placement algorithm can fail to find the optimal solution. Finally we define a number of interesting variations of the basic operator placement problem and demonstrate their hardness.

[1]  Surajit Chaudhuri,et al.  Optimization of queries with user-defined predicates , 1996, TODS.

[2]  Jennifer Widom,et al.  Models and issues in data stream systems , 2002, PODS.

[3]  Ying Xing,et al.  Scalable Distributed Stream Processing , 2003, CIDR.

[4]  Theodore Johnson,et al.  Gigascope: high performance network monitoring with an SQL interface , 2002, SIGMOD '02.

[5]  Wei Hong,et al.  The design of an acquisitional query processor for sensor networks , 2003, SIGMOD '03.

[6]  Michael Stonebraker,et al.  Predicate migration: optimizing queries with expensive predicates , 1992, SIGMOD Conference.

[7]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[8]  Wei Hong,et al.  Proceedings of the 5th Symposium on Operating Systems Design and Implementation Tag: a Tiny Aggregation Service for Ad-hoc Sensor Networks , 2022 .

[9]  Elisa Bertino,et al.  Research Direction in Query Optimization at the University of Maryland. , 1982 .

[10]  Jennifer Widom,et al.  The Pipelined Set Cover Problem , 2005, ICDT.

[11]  Jeffrey F. Naughton,et al.  Maximizing the Output Rate of Multi-Way Join Queries over Streaming Information Sources , 2003, VLDB.

[12]  Jeffrey F. Naughton,et al.  Evaluating window joins over unbounded streams , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[13]  David J. DeWitt,et al.  NiagaraCQ: a scalable continuous query system for Internet databases , 2000, SIGMOD '00.

[14]  Michael Stonebraker,et al.  Distributed query processing in a relational data base system , 1978, SIGMOD Conference.

[15]  Matt Welsh,et al.  Path Optimization in Stream-Based Overlay Networks , 2004 .

[16]  Srinivasan Seshan,et al.  Cache-and-query for wide area sensor databases , 2003, SIGMOD '03.

[17]  László Lovász,et al.  Approximating Min Sum Set Cover , 2004, Algorithmica.

[18]  Frederick Reiss,et al.  Design Considerations for High Fan-In Systems: The HiFi Approach , 2005, CIDR.