Meeting predictable buffer limits in the parallel execution of event processing operators

Complex Event Processing (CEP) systems enable applications to react to live situations by detecting event patterns (complex events) in data streams. As the number of data sources and the volume of data they produce grow, parallelizing event detection becomes essential for limiting the time events must be buffered before an event detector, called an event processing operator, can process them. In this paper, we propose a pattern-sensitive partitioning model for data streams that achieves a high degree of parallelism for event patterns that previously could be detected consistently only in a sequential manner or at a low parallelization degree. Moreover, we propose methods that dynamically adapt the parallelization degree to limit the buffering imposed on event detection when the workload changes dynamically. Extensive evaluations of the system behavior show that the proposed partitioning model allows for a high degree of parallelism and that the proposed adaptation methods meet the buffering limit for event detection under high and dynamic workloads.
