PreDict: Predictive Dictionary Maintenance for Message Compression in Publish/Subscribe

Data usage is a significant concern, particularly for smartphone, machine-to-machine (M2M) and Internet of Things (IoT) applications. Messages in these domains are often exchanged with a backend infrastructure using publish/subscribe (pub/sub). Shared dictionary compression has been shown to reduce data usage in pub/sub networks beyond what well-known techniques such as DEFLATE, gzip and delta encoding achieve, but it requires manual configuration, which increases operational complexity. To address this challenge, we design a new dictionary maintenance algorithm called PreDict, which adapts its parameters to the message stream over time and amortizes the bandwidth overhead induced by compression by enabling high compression ratios. PreDict observes the message stream, takes the costs specific to pub/sub into account, and uses machine learning and parameter fitting to continuously match the parameters of dictionary compression to the characteristics of the streaming messages. The primary goal is to reduce the overall bandwidth of data dissemination without any manual parameterization. PreDict reduces the overall bandwidth by 72.6% on average. Furthermore, the technique reduces the computational overhead by ≈ 2x for publishers and by ≈ 1.4x for subscribers compared to the state of the art using manually selected parameters. In challenging configurations that have many more publishers (10k) than subscribers (1), the overall bandwidth reductions are more than 2x higher than those obtained by the state of the art.
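To illustrate the general idea of shared dictionary compression that the abstract builds on, the following is a minimal sketch using the preset-dictionary support of Python's standard `zlib` module. It is not PreDict itself: the dictionary here is fixed and hand-picked, whereas PreDict is described as learning and maintaining it automatically. The message layout, the dictionary contents and the helper names `compress_with_dict` and `decompress_with_dict` are assumptions made only for this example.

```python
import zlib


def compress_with_dict(message: bytes, dictionary: bytes) -> bytes:
    """Compress one message with DEFLATE, primed with a shared dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(message) + compressor.flush()


def decompress_with_dict(payload: bytes, dictionary: bytes) -> bytes:
    """Decompress a payload; the receiver must hold the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(payload) + decompressor.flush()


# Hypothetical dictionary: byte strings expected to recur across messages.
# Publisher and subscriber are assumed to have agreed on it beforehand.
shared_dictionary = (
    b'{"sensor_id": "", "temperature": , "humidity": , "unit": "celsius"}'
)

# Hypothetical IoT-style message, small and structurally repetitive.
message = (
    b'{"sensor_id": "a17", "temperature": 21.5, '
    b'"humidity": 40, "unit": "celsius"}'
)

plain = zlib.compress(message, 9)
with_dict = compress_with_dict(message, shared_dictionary)
restored = decompress_with_dict(with_dict, shared_dictionary)

print(len(message), len(plain), len(with_dict))
```

For short messages like this one, compression without a dictionary has almost no repetition to exploit, while the dictionary-primed compressor can replace the recurring field names with back-references. The cost is that the dictionary must be distributed to every party and kept useful as the stream changes, which is the maintenance problem the abstract addresses.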
