Paper Information - Partition-Tolerant Distributed Publish/Subscribe Systems

Partition-Tolerant Distributed Publish/Subscribe Systems

In this paper, we develop reliable distributed publish/subscribe algorithms that tolerate the concurrent failure of up to d broker machines or communication links. Here, d is a configuration parameter that determines the system's level of fault tolerance, and reliability refers to exactly-once, per-source in-order delivery of publications to clients with matching subscriptions. We propose protocols that address three problems in the presence of broker or link failures: (i) subscription propagation, (ii) publication forwarding, and (iii) broker recovery. Finally, we study the effectiveness of our approach when the number of concurrent failures exceeds d. Through large-scale experimental evaluations with up to 500 brokers, we demonstrate that a system configured with a modest value of d = 3 reliably delivers 97% of publications in the presence of failures of up to 17% of its brokers.
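To make the reliability definition above concrete, the following is a minimal Python sketch of a receiver-side filter that enforces exactly-once, per-source in-order delivery. It is an illustration only and not the protocol proposed in the paper: the class name `InOrderDeliveryFilter`, the `receive` method, and the assumption that every source numbers its publications consecutively starting at 0 are all hypothetical.

```python
from collections import defaultdict


class InOrderDeliveryFilter:
    """Hypothetical receiver-side filter (not the paper's protocol).

    Assumes each source tags its publications with consecutive
    sequence numbers starting at 0.
    """

    def __init__(self):
        # Next sequence number expected from each source.
        self.next_seq = defaultdict(int)
        # Publications that arrived early, buffered per source by sequence number.
        self.pending = defaultdict(dict)

    def receive(self, source, seq, payload):
        """Return the payloads that become deliverable, in per-source order."""
        if seq < self.next_seq[source] or seq in self.pending[source]:
            # Duplicate: already delivered or already buffered, so drop it.
            return []
        self.pending[source][seq] = payload
        delivered = []
        # Drain the buffer while the next expected sequence number is present.
        while self.next_seq[source] in self.pending[source]:
            delivered.append(self.pending[source].pop(self.next_seq[source]))
            self.next_seq[source] += 1
        return delivered


if __name__ == "__main__":
    f = InOrderDeliveryFilter()
    print(f.receive("p1", 1, "b"))  # arrives early, so it is buffered
    print(f.receive("p1", 0, "a"))  # fills the gap and releases both
    print(f.receive("p1", 0, "a"))  # duplicate, so it is dropped
```

In this sketch a publication that arrives out of order is held back until the gap before it is filled, and a retransmitted copy is discarded. Ordering is tracked independently for each source, so no ordering is imposed between publications from different sources.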

Reza Sherafat Kazemzadeh | Hans-Arno Jacobsen
