Dependable Publish/Subscribe Systems for Distributed Application Development

The publish/subscribe (pub/sub) communication paradigm provides an asynchronous many-to-many communication substrate between data producers and data consumers. Using pub/sub, components of a distributed application can communicate without direct knowledge of each other, which facilitates loose coupling and scalability. A managed pub/sub service can therefore reduce the development and operational effort of Internet-scale distributed applications. In this work, we address four non-functional requirements of distributed content-based pub/sub systems that can facilitate their adoption as the basis of a dependable pub/sub service suitable for distributed application development.

First, we address the availability of a pub/sub service during broker failures. Broker failures can disrupt delivery, so a repair mechanism is required, along with message retransmission to prevent message loss. During repair and recovery, message delivery latency can temporarily increase. To address this problem, we present an epidemic protocol that allows a content-based pub/sub system to keep delivering messages with low latency while failed brokers are recovering. Using a broker similarity metric that takes into account the content space and the overlay topology, we control and direct gossip messages around failed brokers. Based on our evaluation, our approach provides a higher message delivery ratio than the deterministic alternative at high failure rates or when broker failures follow a non-uniform distribution.

Second, we address the scalability of the hop-by-hop routing mechanism used in such distributed pub/sub systems. In an Internet-scale pub/sub service, this routing scheme allows brokers to correctly forward messages without requiring global knowledge. However, brokers forward publications without knowing the number and distance of matching subscribers, which can result in inefficient resource utilization. To improve the scalability of the service, we introduce a popularity-based routing mechanism. We define a utilization metric to measure the impact of forwarding a publication on the overall delivery of the system. Furthermore, we propose a new publication routing algorithm that takes into account broker resources and publication popularity among subscribers. Lastly, we propose three approaches to handle unpopular publications. Based on our evaluation, using
