State-Aware Load Shedding From Input Event Streams in Complex Event Processing

In complex event processing (CEP), load shedding is used to maintain a given latency bound when the system is overloaded and resources are limited. Shedding load, however, degrades the quality of results (QoR), so it is crucial to shed load in the way that has the lowest impact on QoR. Existing work in the CEP domain proposes to drop either events or partial matches (PMs) during overload, and assigns utilities to events or PMs by considering either the importance of events or the importance of PMs, but not both together. In this article, we combine these approaches: we assign a utility to an event by considering both the importance of the event and the importance of the PMs. We propose two load shedding approaches for CEP systems. The first approach drops events from PMs, while the second approach drops events from windows. We adopt a probabilistic model that uses the type and position of an event in a window and the state of a PM to assign a utility to an event. We also propose an approach to predict a utility threshold that is used to drop the amount of events required to maintain a given latency bound. Through extensive evaluations on two real-world datasets and several representative queries, we show that, in the majority of cases, our load shedding approach outperforms state-of-the-art load shedding approaches with respect to QoR.
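The utility assignment and the threshold-based dropping described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `UtilityModel`, `predict_threshold`, and `should_drop`, the position binning, the frequency-based utility estimate, and the default utility of 1.0 for unseen combinations are all assumptions made for illustration. The abstract only states that the utility depends on the event type, the event's position in the window, and the PM state, and that a utility threshold is predicted to drop the required amount of events.

```python
import math
from collections import defaultdict


class UtilityModel:
    """Hypothetical utility table keyed by (event type, position bin, PM state)."""

    def __init__(self, bin_size=10):
        self.bin_size = bin_size              # assumed: window positions are grouped into bins
        self.seen = defaultdict(int)          # times a key was observed
        self.contributed = defaultdict(int)   # times it was part of a complete match

    def _key(self, event_type, position, pm_state):
        return (event_type, position // self.bin_size, pm_state)

    def observe(self, event_type, position, pm_state, contributed):
        """Record one training observation for an (event, PM) pair."""
        key = self._key(event_type, position, pm_state)
        self.seen[key] += 1
        if contributed:
            self.contributed[key] += 1

    def utility(self, event_type, position, pm_state):
        """Estimate utility as the observed contribution frequency."""
        key = self._key(event_type, position, pm_state)
        seen = self.seen.get(key, 0)
        if seen == 0:
            return 1.0  # assumption: keep events about which nothing is known
        return self.contributed.get(key, 0) / seen


def predict_threshold(observed_utilities, drop_fraction):
    """Pick a threshold so that dropping utilities <= threshold sheds
    roughly drop_fraction of the events (ties may shed slightly more)."""
    if drop_fraction <= 0 or not observed_utilities:
        return None  # no shedding needed
    ordered = sorted(observed_utilities)
    k = min(len(ordered), math.ceil(drop_fraction * len(ordered)))
    return ordered[k - 1]


def should_drop(utility, threshold):
    """Drop an event whose utility does not exceed the threshold."""
    return threshold is not None and utility <= threshold
```

In this sketch, `drop_fraction` stands in for the amount of load that has to be shed to meet the latency bound; how that amount is derived from the latency bound is not covered here.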
