Summary-based routing for content-based event distribution networks

Providing scalable distributed Web-based eventing services has been an important research topic. It is desirable to have an effective mechanism for the servers to summarize their filters for in-network preprocessing in order to optimize system performance. In this paper, we propose a summary-based routing mechanism and introduce the notion of imprecise summaries to provide a trade-off between routing overhead and event traffic. Our system uses similarity-based filter clustering to reduce overall event traffic and performs self-tuning summary precision selection to optimize throughput. We have implemented summary-based routing on top of an XML-based infrastructure that closely follows the proposed Web services standards. Measurements from the actual implementation validate our analytical and simulation results, and demonstrate the practical benefits of the proposed techniques.
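
To make the trade-off behind imprecise summaries concrete, the following is a minimal sketch and not the system's actual summary format. It assumes, purely for illustration, that each filter is a closed range over a single numeric attribute and that a summary is a bounded list of covering ranges. The names `summarize`, `matches`, and `max_intervals` are hypothetical. A smaller `max_intervals` gives a more compact summary (less routing overhead) but admits events that no filter wants (more event traffic). It never rejects an event that some filter matches.

```python
from typing import List, Tuple

# Assumption: a filter is a closed range [lo, hi] on one numeric attribute.
Interval = Tuple[float, float]


def summarize(filters: List[Interval], max_intervals: int) -> List[Interval]:
    """Cover all filters with at most max_intervals ranges.

    Overlapping ranges are merged first (a precise summary). If the result
    is still too large, the two neighbours separated by the smallest gap
    are merged repeatedly, which makes the summary imprecise.
    """
    if max_intervals < 1:
        raise ValueError("max_intervals must be at least 1")

    merged: List[Interval] = []
    for lo, hi in sorted(filters):
        if merged and lo <= merged[-1][1]:
            # Overlaps the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))

    while len(merged) > max_intervals:
        # Index of the pair of neighbours with the smallest gap between them.
        i = min(range(len(merged) - 1),
                key=lambda j: merged[j + 1][0] - merged[j][1])
        merged[i:i + 2] = [(merged[i][0], merged[i + 1][1])]

    return merged


def matches(summary: List[Interval], value: float) -> bool:
    """Return True if an event with this attribute value passes the summary."""
    return any(lo <= value <= hi for lo, hi in summary)


filters = [(0, 10), (5, 20), (40, 50), (90, 100)]
precise = summarize(filters, max_intervals=3)
coarse = summarize(filters, max_intervals=2)

# An event with value 30 matches no filter: the precise summary drops it,
# while the coarser summary forwards it as a false positive.
print(precise, matches(precise, 30))
print(coarse, matches(coarse, 30))
```

In this sketch the precision knob is a single integer. A self-tuning scheme like the one the abstract describes would adjust such a knob at run time based on observed throughput.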
