In-Network Outlier Detection in Wireless Sensor Networks

To address the problem of unsupervised outlier detection in wireless sensor networks, we develop an approach that (1) is flexible with respect to the outlier definition, (2) computes the result in-network to reduce both bandwidth and energy consumption, (3) uses only single-hop communication, thus permitting very simple node failure detection and message reliability assurance mechanisms (e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data. We examine performance by simulation, using real sensor data streams. Our results demonstrate that our approach is accurate and imposes reasonable communication and power consumption demands.
