Towards a dependable architecture for Internet-scale sensing

The convergence of embedded sensors and pervasive high-performance networking is giving rise to a new class of distributed applications, which we refer to as Internet-scale sensing (ISS). ISS systems consist of a large number of geographically distributed data sources tied into a framework for collecting, filtering, and processing potentially large volumes of real-time data. In this paper, we discuss the issues involved in building dependable ISS systems. ISS systems differ from conventional distributed systems in several respects, including the number of data sources, differing data quality requirements, and the need to continue operating despite intermittent link and node failures. Such failures should gracefully degrade the quality of the results returned by the system, rather than cause results to be lost. We argue that conventional approaches to achieving consistency do not scale to the requirements of ISS systems, and we outline a lightweight approach to dependability based on a set of metrics that reflect the quality of the answers returned by the system. Specifically, answers returned by an ISS system should include a measure of the harvest and freshness of the data sources participating in the result, and these metrics can in turn drive fault-tolerance mechanisms in the system. We also propose three simple techniques to achieve scalability and graceful degradation in the face of failure.
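
To make the idea of annotating answers with quality metrics concrete, the following is a minimal sketch, not the system described in the paper. It assumes that harvest can be approximated as the fraction of expected data sources that contributed a reading, and that freshness can be approximated by the age of the oldest contributing reading. The names `Answer`, `aggregate`, `harvest`, and `freshness_s`, and the choice of a mean as the aggregate, are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Answer:
    """Result of a query, annotated with quality metrics (hypothetical type)."""
    value: Optional[float]        # mean of the readings that arrived, None if none did
    harvest: float                # assumed: fraction of expected sources that contributed
    freshness_s: Optional[float]  # assumed: age in seconds of the oldest contributing reading


def aggregate(
    readings: Dict[str, Tuple[float, float]],
    expected_sources: int,
    now: float,
) -> Answer:
    """Average whatever readings arrived instead of failing on missing sources.

    `readings` maps a source id to (value, timestamp in seconds).
    Sources that failed or were unreachable are simply absent from the map.
    """
    if expected_sources <= 0:
        raise ValueError("expected_sources must be positive")
    if not readings:
        # Total failure still yields an answer, with zero harvest.
        return Answer(value=None, harvest=0.0, freshness_s=None)

    values = [value for value, _ in readings.values()]
    oldest_timestamp = min(timestamp for _, timestamp in readings.values())
    return Answer(
        value=sum(values) / len(values),
        harvest=len(readings) / expected_sources,
        freshness_s=now - oldest_timestamp,
    )


if __name__ == "__main__":
    # Two of four expected sources responded; the others are treated as failed.
    partial = {"a": (10.0, 95.0), "b": (20.0, 99.0)}
    print(aggregate(partial, expected_sources=4, now=100.0))
```

In this sketch, losing sources lowers `harvest` rather than causing the query to fail, which is one way to read graceful degradation. A caller could compare `harvest` or `freshness_s` against a threshold to decide whether to retry or reroute, although the specific mechanisms proposed in the paper are not shown here.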
