Real-time clustering of sensory data in wireless sensor networks

Data mining in wireless sensor networks (WSNs) is an emerging research area. This paper investigates the problem of real-time clustering of sensory data in WSNs. The objective is to cluster the data collected by sensor nodes in real time according to data similarity in a d-dimensional sensory data space. To perform in-network data clustering efficiently, a Hilbert-curve-based mapping algorithm, HilbertMap, is proposed to convert a d-dimensional sensory data space into a two-dimensional area covered by a sensor network. Based on this mapping, a distributed algorithm for clustering sensory data, H-Cluster, is proposed. It ensures that most of the communication needed for sensory data clustering occurs among geographically nearby sensor nodes and that the clustering is accomplished in an in-network manner. Extensive simulation experiments were conducted using both real-world and synthetic datasets to evaluate the algorithms. In these experiments, H-Cluster consistently achieved the lowest data loss rate, the highest energy efficiency, and the best clustering quality.
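The abstract does not spell out how HilbertMap works, so the following is only a minimal sketch of the locality property that a Hilbert-curve mapping relies on, not the paper's algorithm. It uses the standard two-dimensional Hilbert curve conversion between a one-dimensional curve index and a grid cell. The function names `d2xy`, `xy2d`, and `_rot`, and the idea of treating the deployment area as an `n` by `n` grid of cells, are assumptions made for illustration.

```python
# Illustrative sketch only: standard 2-D Hilbert curve index <-> cell conversion.
# Assumption: the area covered by the network is modelled as an n x n grid,
# where n is a power of two. This is not the paper's HilbertMap algorithm.

def _rot(s, x, y, rx, ry):
    """Rotate/flip a quadrant so the curve orientation is consistent."""
    if ry == 0:
        if rx == 1:
            x = s - 1 - x
            y = s - 1 - y
        x, y = y, x
    return x, y


def d2xy(n, d):
    """Map a Hilbert index d in [0, n*n) to a grid cell (x, y)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = _rot(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def xy2d(n, x, y):
    """Map a grid cell (x, y) back to its Hilbert index."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rot(n, x, y, rx, ry)
        s //= 2
    return d


if __name__ == "__main__":
    n = 4  # hypothetical 4 x 4 grid of cells
    for d in range(n * n):
        print(d, d2xy(n, d))
```

In this sketch, consecutive indices always land in neighbouring grid cells. That is the kind of property that lets values close to each other along the curve be handled by geographically nearby nodes, although the actual mapping from a d-dimensional data space is defined in the paper itself.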
