A Generalized Algorithm for Publish/Subscribe Overlay Design and Its Fast Implementation

Constructing the underlying overlay network to support efficient and scalable information distribution is a challenging and fundamental problem in topic-based publish/subscribe systems. Existing overlay design algorithms aim to minimize node fan-out while building topic-connected overlays, in which all nodes interested in the same topic are organized in a directly connected dissemination sub-overlay. However, most state-of-the-art algorithms suffer from high computational complexity, such as O(|V|^4 |T|), where V is the node set and T is the topic set. We devise a general indexing data structure that provides a significantly faster implementation, with O(|V|^2 |T|) running time, for different state-of-the-art algorithms. The data structure is general because it enables edge lookup by both node degree and edge contribution, a central metric in all existing algorithms. When tested on typical pub/sub workloads, the observed speedup exceeded a factor of 1 000, making the algorithms more suitable for practical use. For example, under a typical Zipf-distributed pub/sub workload with 1 000 nodes and 100 topics, our new implementation completes in 3.823 seconds, while the previous alternative takes over 555 minutes.
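To make the notion of edge contribution concrete, the following is a minimal sketch and not the indexing data structure proposed in the paper. It assumes the usual definition from the topic-connected overlay literature: the contribution of a candidate edge (u, v) is the number of topics shared by u and v for which the two nodes still lie in different topic-connected components. The names `TopicComponents`, `build_index`, and the `interests` input format are hypothetical, and the index here is keyed by contribution only; lookup by node degree is omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations


class TopicComponents:
    """Union-find over nodes, maintained separately for each topic."""

    def __init__(self, interests):
        # interests: dict mapping node -> set of topics (assumed input format)
        self.interests = interests
        self.parent = {(t, v): (t, v) for v, ts in interests.items() for t in ts}

    def find(self, key):
        # Follow parent pointers to the root, halving the path on the way.
        while self.parent[key] != key:
            self.parent[key] = self.parent[self.parent[key]]
            key = self.parent[key]
        return key

    def union(self, topic, u, v):
        # Merge the components of u and v for a single topic.
        self.parent[self.find((topic, u))] = self.find((topic, v))

    def contribution(self, u, v):
        # Count shared topics for which u and v are not yet connected.
        shared = self.interests[u] & self.interests[v]
        return sum(1 for t in shared if self.find((t, u)) != self.find((t, v)))


def build_index(interests):
    """Bucket every candidate edge by its current contribution (naive build)."""
    comps = TopicComponents(interests)
    index = defaultdict(set)
    for u, v in combinations(sorted(interests), 2):
        c = comps.contribution(u, v)
        if c > 0:
            index[c].add((u, v))
    return comps, index


# Toy workload: three nodes, two topics.
interests = {"a": {"t1", "t2"}, "b": {"t1", "t2"}, "c": {"t2"}}
comps, index = build_index(interests)
best = max(index)            # highest contribution currently available
edge = next(iter(index[best]))
```

On this toy workload the edge between `a` and `b` has the highest contribution, because it connects both shared topics at once. A greedy overlay builder would add that edge, merge the affected components, and then update the contributions of the remaining candidate edges; doing that update efficiently is the part an index of this kind is meant to speed up.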
