Community Clustering for Distributed Publish/Subscribe Systems

Optimized placement of clients in a distributed publish/subscribe system is an important technique for improving overall system efficiency. Current methods, such as interest clustering or publisher placement, treat a client as either a pure publisher or a pure subscriber, but not as both. They also usually ignore the cost of client movement. However, many applications built on publish/subscribe systems model clients as publishers and subscribers at the same time, which breaks the assumptions made by current approaches. To account for the complex dependencies among clients, we propose a new community-oriented clustering approach that forms clusters of clients with intense communication relationships while keeping client movement cost low. An evaluation on a public data set shows that our method is efficient, adapts to different experimental settings, and outperforms the popular interest clustering approach with respect to the number of messages sent, propagation hop count, and end-to-end latency.
