JetStream: Enabling high throughput live event streaming on multi-site clouds

Scientific and commercial applications now operate across tens of cloud datacenters around the globe and follow similar patterns: they aggregate monitoring or sensor data, assess QoS, or run global data mining queries based on inter-site event stream processing. Fast data transfers across geographically distributed sites allow such applications to manage continuous streams of events in real time and react quickly to changes. However, traditional event processing engines often treat data resources as second-class citizens and support access to data only as a side effect of computation (i.e., they are not concerned with the transfer of events from their source to the processing site). This approach is efficient as long as the processing is executed in a single cluster whose nodes are interconnected by low-latency networks. In a distributed environment consisting of multiple datacenters, with orders-of-magnitude differences in capabilities and connected by a WAN, it will lead to significant latency and performance variations. This is precisely the challenge we address in this paper by proposing JetStream, a high-performance batch-based streaming middleware for efficient transfers of events between cloud datacenters. JetStream self-adapts to the streaming conditions by modeling and monitoring a set of context parameters. It further aggregates the available bandwidth by enabling multi-route streaming across cloud sites, while at the same time optimizing resource utilization and increasing cost efficiency. The prototype was validated on tens of nodes from US and European datacenters of the Windows Azure cloud, with synthetic benchmarks and a real-life application monitoring the ALICE experiment at CERN. The results show a 3x increase in the transfer rate using adaptive multi-route streaming, compared to state-of-the-art solutions.
