Developing a model that facilitates representation of, and knowledge discovery on, sensor data presents many challenges. Sensors report data at very high frequency, producing large volumes of data, so the model must be memory efficient. Sensor networks have spatial characteristics, which include the location of the sensors. Sensor data is also temporal in nature, so the model must support the time dependence of the data. Balancing the conflicting requirements of simplicity, expressiveness, and storage efficiency is challenging. The model should also provide adequate support for the formulation of efficient algorithms for knowledge discovery. Although spatio-temporal data can be modeled using time-expanded graphs, that model replicates the entire graph across time instants, resulting in high storage overhead and computationally expensive algorithms. In this chapter, we discuss a data model called Spatio-Temporal Sensor Graphs (STSG) to model sensor data, which allows the properties of edges and nodes to be modeled as a time series of measurement data. The data at each instant consists of the measured value and the expected error. We also present several case studies illustrating how the proposed STSG model facilitates methods to find interesting patterns (e.g., growing hotspots) in sensor data.

INTRODUCTION

Finding novel and interesting spatio-temporal patterns in the ever-increasing collection of sensor data is an important problem in several scientific domains. Many of these domains collect sensor data in outdoor environments with underlying physical interactions. For example, in environmental science, maintaining water quality requires a timely response to anticipated watershed or in-plant events (e.g., chemical spills, terrorism).
Such a case occurred in Milwaukee, WI, in 1993, where an outbreak of the harmful pathogen Cryptosporidium parvum in the river streams infected more than 400,000 people and caused more than 100 deaths. The estimated total cost of treating outbreak-related illness was $96.2 million [7]. As was the case in Milwaukee, such failures are typically detected long after the exposure, through observed spikes in doctor or hospital visits or in sales of certain medicines. In addition to unplanned “natural” events like the Cryptosporidium episode, another concern regarding water supplies is an act of terrorism. Clearly, when public health is at stake, waiting for illnesses and fatalities to arise is much too late. Identifying and modeling spatio-temporal patterns such as hotspots and growing hotspots from sensor graphs is therefore important [14]. Other applications that generate similar sensor data include road traffic systems, where measurements of traffic flow and congestion are important, especially in emergency operations such as evacuations.

A Sensor Network Data Model for the Discovery of Spatio-Temporal Patterns. © 2009 by Taylor & Francis Group, LLC
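The STSG idea summarized in the abstract, where each node and edge carries a time series whose entries hold a measured value and an expected error, can be illustrated with a minimal sketch. The class and function names below (`Measurement`, `SensorGraph`, `time_expanded_size`) are hypothetical and not taken from the chapter, and the size comparison is only a rough count of node and edge copies under the assumption that a time-expanded graph replicates the whole graph once per time instant.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Measurement:
    """One reading at one time instant (hypothetical layout)."""
    value: float  # measured value
    error: float  # expected error of the measurement


@dataclass
class SensorGraph:
    """Illustrative STSG-like structure: topology is stored once,
    and each node or edge holds a time series indexed by time instant."""
    nodes: Dict[str, List[Measurement]] = field(default_factory=dict)
    edges: Dict[Tuple[str, str], List[Measurement]] = field(default_factory=dict)

    def add_node(self, node_id: str, series: List[Measurement]) -> None:
        self.nodes[node_id] = list(series)

    def add_edge(self, src: str, dst: str, series: List[Measurement]) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[(src, dst)] = list(series)

    def node_at(self, node_id: str, t: int) -> Measurement:
        """Reading of a node at time instant t."""
        return self.nodes[node_id][t]

    def structural_size(self) -> int:
        """Nodes plus edges, independent of the number of time instants."""
        return len(self.nodes) + len(self.edges)


def time_expanded_size(graph: SensorGraph, instants: int) -> int:
    """Node and edge copies if the graph were replicated at every instant.
    Assumption: one full copy per instant; cross-time edges are ignored."""
    return instants * graph.structural_size()


# Two sensors joined by one edge, observed at three time instants.
g = SensorGraph()
g.add_node("s1", [Measurement(2.0, 0.1), Measurement(2.5, 0.1), Measurement(3.1, 0.2)])
g.add_node("s2", [Measurement(1.0, 0.1), Measurement(1.2, 0.1), Measurement(1.1, 0.1)])
g.add_edge("s1", "s2", [Measurement(0.4, 0.05), Measurement(0.6, 0.05), Measurement(0.9, 0.05)])

print(g.structural_size())       # 3 structural elements, stored once
print(time_expanded_size(g, 3))  # 9 copies under the replication assumption
print(g.node_at("s1", 1))
```

In this sketch the number of structural elements stays fixed as more time instants are recorded, while the replicated count grows linearly with the number of instants. Both representations still store every measurement, so the saving illustrated here is in the graph structure only.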
REFERENCES

[1] Michael Stonebraker, et al. Linear Road: A Stream Data Management Benchmark. VLDB, 2004.
[2] Ramesh Govindan, et al. The Sensor Network as a Database. 2002.
[3] Philippe Bonnet, et al. Towards Sensor Database Systems. Mobile Data Management, 2001.
[4] Shashi Shekhar, et al. Detecting graph-based spatial outliers: algorithms and applications (a summary of results). KDD '01, 2001.
[5] Kay Römer, et al. Middleware challenges for wireless sensor networks. MOCO, 2002.
[6] Daniel Sawitzki, et al. Implicit Maximization of Flows over Time. 2004.
[7] Shashi Shekhar, et al. Spatio-temporal Network Databases and Routing Algorithms: A Summary of Results. SSTD, 2007.
[8] Joseph M. Hellerstein, et al. Eddies: continuously adaptive query processing. SIGMOD '00, 2000.
[9] Shashi Shekhar, et al. Spatial Databases: A Tour. 2003.
[10] C. Lu. A Unified Approach to Spatial Outliers Detection. 2003.
[11] Robert L. Smith, et al. Fastest Paths in Time-dependent Networks for Intelligent Vehicle-Highway Systems Application. J. Intell. Transp. Syst., 1993.
[12] Shashi Shekhar, et al. Time-Aggregated Graphs for Modeling Spatio-temporal Networks. J. Data Semant., 2006.
[13] Charles E. Heckler, et al. Applied Multivariate Statistical Analysis. Technometrics, 2005.
[14] Satish Kumar, et al. Next century challenges: scalable coordination in sensor networks. MobiCom, 1999.
[15] M A Koopmanschap, et al. [The cost of illness]. Nederlands tijdschrift voor geneeskunde, 1992.
[16] S. Pallottino, et al. Shortest Path Algorithms in Transportation models: classical and innovative aspects. 1997.
[17] John S. Gulliver, et al. Development of a commercial code-based two-fluid model for bubble plumes. Environ. Model. Softw., 2007.
[18] Craig Gotsman, et al. Distributed Graph Layout for Sensor Networks. Graph Drawing, 2004.
[19] Matt Welsh, et al. Sensor networks for emergency response: challenges and opportunities. IEEE Pervasive Computing, 2004.
[20] W. R. Buckland, et al. Outliers in Statistical Data. 1979.
[21] Chee-Yee Chong, et al. Sensor networks: evolution, opportunities, and challenges. Proc. IEEE, 2003.
[22] Martin Mauve, et al. A survey on position-based routing in mobile ad hoc networks. IEEE Netw., 2001.
[23] J. P. Davis, et al. Costs of Illness in the 1993 Waterborne Cryptosporidium Outbreak, Milwaukee, Wisconsin. Emerging infectious diseases, 2003.
[24] Yan Huang, et al. Exploiting Spatial Autocorrelation to Efficiently Process Correlation-Based Similarity Queries. SSTD, 2003.
[25] Martin Skutella, et al. Time-Expanded Graphs for Flow-Dependent Transit Times. ESA, 2002.