Latency- and bandwidth-minimizing failure detectors

Failure detectors are fundamental building blocks in distributed systems. Multi-node failure detectors, where the detector is tasked with monitoring N other nodes, play a critical role in overlay networks and peer-to-peer systems. In such networks, failures need to be detected quickly and with low overhead. Achieving these properties simultaneously poses a difficult tradeoff between detection latency and resource consumption. In this paper, we examine this central tradeoff, formalize it as an optimization problem, and analytically derive optimal closed-form formulas for multi-node failure detectors. We provide two variants of the optimal solution, each targeting an optimality metric appropriate for a different deployment scenario. √s-LM is a latency-minimizing optimal failure detector that achieves the lowest average failure detection latency under a fixed bandwidth budget for system maintenance. √s-BM is a bandwidth-minimizing optimal failure detector that meets a desired detection latency target while consuming the least bandwidth. We evaluate our results with node lifetimes drawn from bimodal and Pareto distributions, as well as real-world trace data from PlanetLab hosts, web sites, and Microsoft PCs. Compared to standard failure detectors in wide use, √s failure detectors reduce failure detection latencies by 40% on average for the same bandwidth consumption or, conversely, reduce the amount of bandwidth consumed by 30% for the same failure detection latency.
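The abstract does not state the closed-form formulas themselves, so the following is only a toy illustration of the kind of latency-versus-bandwidth optimization it describes, not the paper's derivation. It assumes a simplified model that is not taken from the source: node i fails at a known rate `failure_rates[i]`, is probed `r_i` times per second, a failure is detected on average `1 / (2 * r_i)` seconds after it occurs, and the total probe rate is capped by `budget`. Under those assumptions, a standard Lagrange-multiplier argument gives probe rates proportional to the square root of each node's failure rate. All function and parameter names are hypothetical.

```python
import math


def sqrt_rates(failure_rates, budget):
    """Split a probe budget in proportion to sqrt(failure rate) (toy model)."""
    roots = [math.sqrt(lam) for lam in failure_rates]
    total = sum(roots)
    return [budget * root / total for root in roots]


def uniform_rates(failure_rates, budget):
    """Baseline: probe every node at the same rate."""
    n = len(failure_rates)
    return [budget / n] * n


def mean_detection_latency(failure_rates, probe_rates):
    """Failure-weighted average detection latency.

    Assumes a failure lands uniformly inside a probe interval, so the
    expected wait until the next probe is half the interval, 1 / (2 * r).
    """
    weighted = sum(lam / (2.0 * r) for lam, r in zip(failure_rates, probe_rates))
    return weighted / sum(failure_rates)


if __name__ == "__main__":
    # Hypothetical workload: two short-lived nodes and eight long-lived ones.
    rates = [1e-3] * 2 + [1e-5] * 8
    budget = 1.0  # total probes per second (assumed)
    for name, alloc in (("uniform", uniform_rates), ("sqrt", sqrt_rates)):
        latency = mean_detection_latency(rates, alloc(rates, budget))
        print(f"{name:8s} mean detection latency: {latency:.3f} s")
```

In this toy model the square-root split never does worse than the uniform split for the same budget, which follows from the Cauchy-Schwarz inequality. The size of the gain depends entirely on how skewed the assumed failure rates are.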
