The effect of forgetting on the performance of a synchronizer

Abstract. We study variants of the α-synchronizer by Awerbuch (1985) within a distributed message-passing system with probabilistic message loss. The purpose of a synchronizer is to maintain a virtual (lock-step) round structure, which simplifies the design of higher-level distributed algorithms. The underlying idea of an α-synchronizer is to let processes continuously exchange round numbers and to allow a process to proceed to the next round only after it has witnessed that all processes have already started the current round. We analyze the performance of several such synchronizers in this lossy environment, and in particular how different strategies of forgetting affect the round durations. The synchronizer variants considered differ in the times at which processes discard part of their accumulated knowledge during the execution. Possible applications can be found, e.g., in sensor fusion, where sensor data become outdated and thus invalid after a certain amount of time. For all synchronizer variants considered, we develop corresponding Markov chain models and quantify the performance degradation using both analytic approaches and Monte-Carlo simulations. Our results make it possible to calculate the asymptotic behavior of the round durations explicitly: while the effect of forgetting is negligible in systems with very reliable communication, it is more pronounced in systems with less reliable communication. Our study thus provides computationally efficient bounds on the performance of the (non-forgetting) α-synchronizer and allows one to assess quantitatively the effect that accumulated knowledge has on the performance.
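To make the mechanism concrete, the following is a minimal Monte-Carlo sketch of an α-synchronizer under message loss. It is an illustration under stated assumptions, not the model analyzed in the paper: time is assumed to advance in discrete steps, every message is delivered independently with probability `p`, and the only forgetting strategy shown discards everything heard from other processes at each local round switch. The function name `average_round_duration` and all of its parameters are hypothetical.

```python
import random


def average_round_duration(n, p, forget, rounds=2000, seed=0):
    """Estimate the average number of time steps per round.

    Assumed model (not taken from the paper): in each discrete step every
    process broadcasts its round number, and each message arrives
    independently with probability p. A process in round r moves to round
    r + 1 once it knows that all processes have started round r.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    rng = random.Random(seed)
    rnd = [0] * n  # current round of each process
    # known[i][j]: highest round number i has heard from j (-1 = nothing)
    known = [[-1] * n for _ in range(n)]
    for i in range(n):
        known[i][i] = 0

    steps = 0
    while min(rnd) < rounds:
        steps += 1
        sent = rnd[:]  # round numbers broadcast in this step
        for i in range(n):
            for j in range(n):
                # One random draw per directed link, so the loss pattern
                # depends only on the seed, not on the forgetting strategy.
                if i != j and rng.random() < p:
                    known[i][j] = max(known[i][j], sent[j])
        for i in range(n):
            if min(known[i]) >= rnd[i]:
                rnd[i] += 1
                if forget:
                    # Assumed forgetting strategy: drop all knowledge
                    # about other processes on every round switch.
                    known[i] = [-1] * n
                known[i][i] = rnd[i]
    return steps / rounds


if __name__ == "__main__":
    for p in (0.99, 0.9, 0.5):
        keep = average_round_duration(5, p, forget=False)
        drop = average_round_duration(5, p, forget=True)
        print(f"p={p}: no forgetting {keep:.3f}, forgetting {drop:.3f}")
```

Because both variants are driven by the same seeded loss pattern, the forgetting variant can never finish its rounds earlier than the non-forgetting one in this sketch, which makes the gap between the two estimates easy to compare across values of `p`.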

[1] Matthias Függer, et al. The Effect of Forgetting on the Performance of a Synchronizer, 2013, ALGOSENSORS.

[2] Sergio Rajsbaum. Upper and Lower Bounds for Stochastic Marked Graphs, 1994, Inf. Process. Lett.

[3] Ulrich Schmid, et al. How to reconcile fault-tolerant interval intersection with the Lipschitz condition, 2001, Distributed Computing.

[4] Matthias Függer, et al. On the Performance of a Retransmission-Based Synchronizer, 2011, SIROCCO.

[5] George Varghese, et al. Crash failures can drive protocols to arbitrary states, 1996, PODC '96.

[6] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.

[7] Helmut Prodinger, et al. A result in order statistics related to probabilistic counting, 1993, Computing.

[8] Bernadette Charron-Bost, et al. Crash Failures vs. Crash + Link Failures (Abstract), 1996, ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing.

[9] Ulrich Schmid, et al. The Theta-Model: achieving synchrony without clocks, 2009, Distributed Computing.

[10] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.

[11] Joseph Y. Halpern. Using Reasoning About Knowledge to Analyze Distributed Systems, 1987.

[12] Wojciech Szpankowski, et al. Yet another application of a binomial recurrence. Order statistics, 1990, Computing.

[14] Bernadette Charron-Bost, et al. Crash failures vs. crash + link failures, 1996, PODC '96.

[15] Keith Marzullo, et al. Tolerating failures of continuous-valued sensors, 1990, TOCS.

[16] Ronald Fagin, et al. Reasoning about knowledge and probability, 1988, JACM.

[17] W. Szpankowski, et al. Yet Another Application of a Binomial Recurrence, 1988.

[18] Ronald L. Graham, et al. Concrete Mathematics, a Foundation for Computer Science, 1991, The Mathematical Gazette.

[19] Baruch Awerbuch, et al. Complexity of network synchronization, 1985, JACM.

[20] Shlomi Dolev, et al. Self Stabilization, 2004, J. Aerosp. Comput. Inf. Commun.

[21] Ronald Fagin, et al. Reasoning about knowledge, 1995.

[22] J. Propp, et al. Exact sampling with coupled Markov chains and applications to statistical mechanics, 1996.

[23] Eduardo F. Nakamura, et al. Information fusion for wireless sensor networks: Methods, models, and classifications, 2007, CSUR.

[24] Moshe Sidi, et al. On the Performance of Synchronized Programs in Distributed Networks with Random Processing Times and Transmission Delays, 1994, IEEE Trans. Parallel Distributed Syst.