Stochastic analysis and comparison of self-stabilizing routing algorithms for publish/subscribe systems

Publish/subscribe is becoming increasingly popular because it provides a means of decoupled communication. One important issue for the success of publish/subscribe middleware is making it fault tolerant. Classical fault-tolerance mechanisms apply redundancy to mask certain faults. However, if a fault cannot be masked, there is no guarantee that the system ever returns to normal operation. In contrast, self-stabilizing systems recover from arbitrary transient faults, provided that no further faults occur before the system has become stable again. While the system stabilizes, however, it may not exhibit the desired behavior. In this paper, we present the first comprehensive analysis of publish/subscribe systems that includes self-stabilization, giving an alternative to extensive simulations. The analysis is based on continuous-time birth-death Markov chains and investigates the characteristics of publish/subscribe systems in equilibrium. We give closed-form analytical solutions for the sizes of the routing tables, for the overhead required to keep the routing tables up to date, and for the leasing overhead required for self-stabilization. To judge the efficiency of self-stabilizing routing, we compare it to flooding, which is the naive implementation of a self-stabilizing publish/subscribe system.
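
To illustrate the kind of equilibrium analysis mentioned above, the following is a minimal sketch of how the stationary distribution of a continuous-time birth-death Markov chain can be computed numerically. It is not the model of the paper: the interpretation of the state as the number of routing table entries, the constant arrival rate `lam`, the per-entry expiry rate `mu`, the truncation bound `n_max`, and the function name `birth_death_stationary` are all assumptions made for illustration only.

```python
import math


def birth_death_stationary(birth, death, n_max):
    """Stationary distribution of a birth-death chain truncated to 0..n_max.

    birth(n) is the rate of the transition n -> n+1,
    death(n) is the rate of the transition n -> n-1 (for n >= 1).
    Uses the detailed-balance relation pi[n] * death(n) = pi[n-1] * birth(n-1).
    """
    weights = [1.0]  # unnormalized weight of state 0
    for n in range(1, n_max + 1):
        weights.append(weights[-1] * birth(n - 1) / death(n))
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates: entries are added at rate lam, and each of the n
# existing entries is removed independently at rate mu (an M/M/infinity chain).
lam, mu = 4.0, 0.5
pi = birth_death_stationary(lambda n: lam, lambda n: n * mu, n_max=100)

# Expected number of entries in equilibrium under these assumptions.
mean_entries = sum(n * p for n, p in enumerate(pi))
```

Under these assumed rates the chain is the standard M/M/∞ model, whose equilibrium distribution is Poisson with mean `lam / mu`; the numerical result can therefore be checked against that closed form.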
