Building Content-Based Publish/Subscribe Systems with Distributed Hash Tables

Building distributed content–based publish/subscribe systems has remained a challenge. Existing solutions typically use a relatively small set of trusted computers as brokers, which may lead to scalability concerns for large Internet–scale workloads. Moreover, since each broker maintains state for a large number of users, it may be difficult to tolerate faults at each broker. In this paper we propose an approach to building content–based publish/subscribe systems on top of distributed hash table (DHT) systems. DHT systems have been effectively used for scalable and fault–tolerant resource lookup in large peer–to–peer networks. Our approach provides predicate–based query semantics and supports constrained range queries. Experimental evaluation shows that our approach is scalable to thousands of brokers, although proper tuning is required.

[1]  Mark Handley,et al.  A scalable content-addressable network , 2001, SIGCOMM '01.

[2]  Jacob R. Lorch,et al.  Farsite: federated, available, and reliable storage for an incompletely trusted environment , 2002, OSDI '02.

[3]  Hans-Arno Jacobsen,et al.  Predicate matching and subscription matching in Publish/Subscribe systems , 2002, Proceedings 22nd International Conference on Distributed Computing Systems Workshops.

[4]  Peter R. Pietzuch,et al.  Peer-to-peer overlay broker networks in an event-based middleware , 2003, DEBS '03.

[5]  David R. Karger,et al.  Chord: A scalable peer-to-peer lookup service for internet applications , 2001, SIGCOMM '01.

[6]  Artur Andrzejak,et al.  Scalable, efficient range queries for grid information services , 2002, Proceedings. Second International Conference on Peer-to-Peer Computing,.

[7]  Alejandro P. Buchmann,et al.  A peer-to-peer approach to content-based publish/subscribe , 2003, DEBS '03.

[8]  Robert Tappan Morris,et al.  Ivy: a read/write peer-to-peer file system , 2002, OSDI '02.

[9]  Antony I. T. Rowstron,et al.  Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility , 2001, SOSP.

[10]  David R. Karger,et al.  Wide-area cooperative storage with CFS , 2001, SOSP.

[11]  Anoop Gupta,et al.  Query Processing Over Peer-To-Peer Data Sharing Systems , 2002 .

[12]  Antony I. T. Rowstron,et al.  Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems , 2001, Middleware.

[13]  Peter Triantafillou,et al.  Subscription summaries for scalability and efficiency in publish/subscribe systems , 2002, Proceedings 22nd International Conference on Distributed Computing Systems Workshops.

[14]  Michael B. Jones,et al.  SkipNet: A Scalable Overlay Network with Practical Locality Properties , 2003, USENIX Symposium on Internet Technologies and Systems.

[15]  David S. Rosenblum,et al.  Achieving scalability and expressiveness in an Internet-scale event notification service , 2000, PODC '00.

[16]  Miguel Castro,et al.  Scribe: a large-scale and decentralized application-level multicast infrastructure , 2002, IEEE J. Sel. Areas Commun..

[17]  Karl Aberer,et al.  Improving Data Access in P2P Systems , 2002, IEEE Internet Comput..

[18]  Hans-Arno Jacobsen,et al.  S-ToPSS: Semantic Toronto Publish/Subscribe System , 2003, VLDB.

[19]  Peter Druschel,et al.  Storage management and caching in PAST , 2001 .

[20]  Hans-Arno Jacobsen,et al.  A-TOPSS - A Publish/Subscribe System Supporting Approximate Matching , 2002, VLDB.

[21]  Divyakant Agrawal,et al.  Approximate Range Selection Queries in Peer-to-Peer Systems , 2003, CIDR.

[22]  Dennis Shasha,et al.  Filtering algorithms and implementation for very fast publish/subscribe systems , 2001, SIGMOD '01.