Shared dictionary compression in publish/subscribe systems

Publish/subscribe is known as a scalable and efficient data dissemination mechanism. Its efficiency stems from optimized routing algorithms, yet little work has addressed the use of compression to save bandwidth, which is especially important in mobile environments. State-of-the-art compression methods such as GZip or Deflate can generally be employed to compress messages. In this paper, we show how to reduce bandwidth even further by employing Shared Dictionary Compression (SDC) in pub/sub. However, SDC requires a dictionary to be generated and disseminated prior to compression, which introduces additional computational and bandwidth overhead. To support SDC, we propose a novel and lightweight protocol for pub/sub that employs a new class of brokers, called sampling brokers. Our solution uses the sampling brokers to generate and disseminate dictionaries. Dictionary maintenance is performed regularly using an adaptive algorithm. The evaluation of our proposed design shows that it is possible to compensate for the introduced overhead and achieve significant bandwidth reduction over Deflate.
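The general idea behind shared dictionary compression can be illustrated with a minimal sketch. It is not the protocol proposed here: it uses the preset-dictionary feature of Python's zlib (Deflate) as a stand-in for SDC, and the dictionary is built by hand from two sample messages rather than by sampling brokers. The names SHARED_DICTIONARY, MESSAGE, and the helper functions are hypothetical and chosen only for illustration.

```python
import zlib

# Hypothetical dictionary: in this sketch it is just two earlier messages
# concatenated. Both publisher and subscriber must hold the same bytes.
SHARED_DICTIONARY = (
    b'{"sensor_id": "s-17", "temperature": 21.5, "unit": "celsius", "status": "ok"}'
    b'{"sensor_id": "s-03", "temperature": 18.2, "unit": "celsius", "status": "ok"}'
)

# Hypothetical new publication that resembles the earlier messages.
MESSAGE = b'{"sensor_id": "s-42", "temperature": 19.8, "unit": "celsius", "status": "ok"}'


def compress_plain(message: bytes) -> bytes:
    """Compress with Deflate (zlib container) and no shared dictionary."""
    return zlib.compress(message, 9)


def compress_with_dictionary(message: bytes, dictionary: bytes) -> bytes:
    """Compress with Deflate primed by a preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(message) + compressor.flush()


def decompress_with_dictionary(payload: bytes, dictionary: bytes) -> bytes:
    """Decompress a payload that was produced with the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(payload) + decompressor.flush()


if __name__ == "__main__":
    plain = compress_plain(MESSAGE)
    shared = compress_with_dictionary(MESSAGE, SHARED_DICTIONARY)
    restored = decompress_with_dictionary(shared, SHARED_DICTIONARY)
    print("original bytes:        ", len(MESSAGE))
    print("Deflate only:          ", len(plain))
    print("Deflate + shared dict: ", len(shared))
    print("round trip ok:         ", restored == MESSAGE)
```

For short messages such as this one, plain Deflate has little redundancy to exploit within a single message, whereas the dictionary lets the compressor reference content seen in earlier messages. The sketch leaves out the cost of distributing the dictionary, which is the overhead the abstract refers to.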
