Distributed deviation detection in sensor networks

Sensor networks have recently attracted much attention because of their potential applications in a number of different settings. Sensors can be deployed in large numbers over wide geographical areas and can be used to monitor physical phenomena or to detect certain events. An interesting problem that has not been adequately addressed so far is distributed online deviation detection in streaming data. Identifying deviating values provides an efficient way to focus on the interesting events in the sensor network. In this work, we propose techniques for online deviation detection in streaming data. We discuss how these techniques can operate efficiently in the distributed environment of a sensor network, and we examine the tradeoffs that arise in this setting. Our techniques process as much of the data as possible in a decentralized fashion, so as to avoid unnecessary communication and computational effort.
