A Sampling-Based Approach to Optimizing Top-k Queries in Sensor Networks

Wireless sensor networks generate vast amounts of data. This data, however, must be extracted sparingly to conserve energy, usually the most precious resource in battery-powered sensors. When approximation is acceptable, a model-driven approach to query processing saves energy by avoiding contact with nodes whose values can be predicted or are unlikely to be in the result set. To optimize queries such as top-k, however, reasoning directly with models of joint probability distributions can be prohibitively expensive. Instead of using models explicitly, we propose to use samples of past sensor readings. Such samples are simple to maintain and computationally efficient to use in query optimization. With these samples, we can formulate the problem of optimizing approximate top-k queries under an energy constraint as a linear program. We demonstrate the power and flexibility of our sampling-based approach by developing a series of top-k query planning algorithms based on linear programming, which efficiently produce plans with better performance and novel features. We show that our approach is both theoretically sound and practically effective on simulated and real-world datasets.
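
To make the idea of planning from samples concrete, the following is a minimal sketch and not the linear-programming formulation described above. It assumes each sample is a list of past readings indexed by node, that each node has an independent positive contact cost (shared routing costs are ignored), and it substitutes a simple greedy heuristic for the linear program. The names `topk_nodes`, `expected_hits`, and `greedy_plan` are hypothetical.

```python
from typing import List, Set


def topk_nodes(sample: List[float], k: int) -> Set[int]:
    """Indices of the k largest readings in one sample (ties go to the lower index)."""
    order = sorted(range(len(sample)), key=lambda i: (-sample[i], i))
    return set(order[:k])


def expected_hits(samples: List[List[float]], plan: Set[int], k: int) -> float:
    """Average number of true top-k nodes that the plan contacts, over all samples."""
    return sum(len(topk_nodes(s, k) & plan) for s in samples) / len(samples)


def greedy_plan(samples: List[List[float]], costs: List[float],
                budget: float, k: int) -> Set[int]:
    """Greedy stand-in for the optimizer (an assumption, not the paper's method).

    Counts how often each node appears in a sample's top-k, then adds nodes in
    decreasing order of (frequency / cost) while the energy budget allows.
    """
    freq = [0] * len(costs)
    for s in samples:
        for i in topk_nodes(s, k):
            freq[i] += 1

    plan: Set[int] = set()
    spent = 0.0
    for i in sorted(range(len(costs)), key=lambda i: (-freq[i] / costs[i], i)):
        if freq[i] > 0 and spent + costs[i] <= budget:
            plan.add(i)
            spent += costs[i]
    return plan


# Three past samples over four nodes; node 2 is the most expensive to contact.
samples = [[5, 1, 9, 3], [6, 2, 8, 1], [2, 7, 9, 1]]
costs = [1.0, 1.0, 2.0, 1.0]
plan = greedy_plan(samples, costs, budget=3.0, k=2)
print(sorted(plan), expected_hits(samples, plan, k=2))
```

The samples play the role that an explicit joint distribution would otherwise play: a candidate plan is scored simply by replaying it against past readings. A linear program could optimize the same sample-based objective while also accounting for costs shared along routing paths, which this greedy sketch does not attempt.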
