ADAPTATION - Algorithms to Adaptive Fault Monitoring and their implementation on CORBA

This paper presents ADAPTATION - Algorithms to Adaptive Fault Monitoring for asynchronous distributed systems, together with their implementation on CORBA. The algorithms adjust the monitoring timeouts according to a recent history of the elapsed times of monitoring messages. The aim is to shorten the response time to crashes while keeping to a minimum the confusion between a suspicion caused by network overload and one caused by a real process crash. The proposed approach extends the OMG Fault Tolerant CORBA specification with the push model and with the definition of pull and push ADAPTATION fault monitors. Experiments with ADAPTATION on ACE+TAO were carried out to observe the behavior of the algorithms under changing network workloads.
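The idea of deriving a timeout from a recent history of elapsed times can be illustrated with a minimal sketch. This is not the ADAPTATION algorithm itself, whose formulas are not given in the abstract. It assumes a simple rule (mean of a sliding window plus a multiple of its standard deviation), and the class name `AdaptiveTimeout` and the parameters `initial_timeout`, `window`, and `margin` are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class AdaptiveTimeout:
    """Illustrative timeout estimator (assumed rule, not the paper's algorithm)."""

    def __init__(self, initial_timeout=1.0, window=10, margin=2.0):
        self.initial_timeout = initial_timeout  # used until a sample arrives
        self.margin = margin                    # safety factor on the deviation
        self.history = deque(maxlen=window)     # most recent elapsed times only

    def record(self, elapsed):
        """Store the elapsed time (seconds) of one monitoring message."""
        self.history.append(elapsed)

    def timeout(self):
        """Return the current timeout: window mean plus margin * deviation."""
        if not self.history:
            return self.initial_timeout
        return mean(self.history) + self.margin * pstdev(self.history)

    def suspects(self, waited):
        """Suspect the monitored process once the wait exceeds the timeout."""
        return waited > self.timeout()


if __name__ == "__main__":
    monitor = AdaptiveTimeout(window=3)
    for elapsed in (0.1, 0.1, 0.4):  # the last sample mimics a loaded network
        monitor.record(elapsed)
    print(monitor.timeout(), monitor.suspects(0.3), monitor.suspects(1.0))
```

Because the window is bounded, old samples are discarded, so the timeout grows when the network slows down and shrinks again once it recovers. A larger `margin` trades slower crash detection for fewer false suspicions.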
