Enactment of adaptation in data stream processing with latency implications: A systematic literature review

Abstract

Context: Stream processing is a popular paradigm for continuously processing huge amounts of data. Runtime adaptation plays a significant role in optimizing data processing tasks and has received significant interest in the scientific literature in recent years. However, no categorization of the enactment approaches for runtime adaptation in stream processing has been established so far.

Objective: This paper identifies and characterizes different approaches to the enactment of runtime adaptation in stream processing, with a main focus on latency as the quality dimension.

Method: We performed a systematic literature review (SLR) targeting five main research questions. An automated search yielded 244 papers, of which 75, published between 2006 and 2018, were finally included. From the selected papers, we extracted data such as processing problems, adaptation goals, enactment approaches, enactment techniques, evaluation metrics, and the evaluation parameters used to trigger the enactment of adaptation.

Results: We identified 17 different enactment approaches and categorized them into a taxonomy. For each approach, we extracted the underlying technique used to implement it. Further, based on the extracted data properties, we identified 9 categories of processing problems, 6 adaptation goals, 9 evaluation metrics, and 12 evaluation parameters.

Conclusion: We observed that research interest in enactment approaches for the adaptation of stream processing has increased significantly in recent years. The most commonly applied enactment approaches are parameter adaptation, which tunes parameters or settings of the processing; load balancing, which redistributes workloads; and processing scaling, which dynamically scales the processing up and down. In addition to latency, most adaptations also address resource fluctuation and bottleneck problems. To create a dynamic environment for evaluating enactment approaches, researchers often vary input rates or processing workloads.
