In-network fault tolerance in networked sensor systems

Networked sensor systems (NSS) have the potential to significantly enhance our ability to monitor and interact with our physical environment. For example, numerous tiny sensors may be spread throughout a forest to reveal its life and events in detail across time and space. Critical to the success of NSS is the reliable collection and dissemination of data while conserving sensors' limited resources, energy in particular. In this paper, we propose a novel scheme that provides fault-tolerant data collection and dissemination through data checkpointing and recovery. We assume that data is periodically reported from a sink node to the end users. Using a model-based simulation, we show that our scheme is highly resilient to sensor and sink failures. It also extends NSS lifetime and improves data collection with minimal overhead.
