Foundations of Complex Event Processing

Complex Event Processing (CEP) has emerged as the unifying field for technologies that require processing and correlating heterogeneous distributed data sources in real time. CEP finds applications in diverse domains, which has resulted in a large number of proposals for expressing and processing complex events. However, existing CEP frameworks are based on ad-hoc solutions that do not rest on solid theoretical ground, making them hard to understand, extend, or generalize. Moreover, they are usually presented as application programming interfaces documented by examples, and using each of them requires learning a different set of skills. In this paper we embark on the task of giving a rigorous framework to CEP. As a starting point, we propose a formal language for specifying complex events, called CEPL, that contains the common features used in the literature and has a simple denotational semantics. We also formalize the so-called selection strategies, which are the cornerstone of CEP and have so far been presented only as by-design extensions to existing frameworks. With a well-defined semantics at hand, we study how to efficiently evaluate CEPL for processing complex events. We provide optimization results based on rewriting formulas to a normal form that simplifies the evaluation of filters. Furthermore, we introduce a formal computational model for CEP based on transducers and symbolic automata, called match automata, that captures the regular core of CEPL, i.e., formulas with unary predicates. By using rewriting techniques and automata-based translations, we show that formulas in the regular core of CEPL can be evaluated using constant time per event followed by constant-delay enumeration of the output (under data complexity). By gathering these results together, we propose a framework for efficiently evaluating CEPL, establishing solid foundations for future CEP systems.
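To make the idea of an automaton over unary predicates concrete, the following toy sketch may help. It is not the paper's match-automata construction and does not implement CEPL's semantics. It only illustrates a symbolic transducer whose transitions are guarded by predicates on a single event and which either marks or skips each event. All names (`Transition`, `run`, `is_hot`, `is_dry`, the states `q0` to `q2`) and the sample stream are hypothetical. The simulation is deliberately naive: it keeps every partial run, so it does not achieve constant time per event or constant-delay enumeration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Set, Tuple

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]


@dataclass(frozen=True)
class Transition:
    source: str
    predicate: Predicate  # unary: inspects only the current event
    mark: bool            # True: the event's position joins the output
    target: str


def run(transitions: List[Transition], initial: str, finals: Set[str],
        stream: List[Event]) -> List[Tuple[int, ...]]:
    """Naively simulate all runs; return the marked positions of each match."""
    runs: List[Tuple[str, Tuple[int, ...]]] = [(initial, ())]
    matches: List[Tuple[int, ...]] = []
    for pos, event in enumerate(stream):
        next_runs = []
        for state, marked in runs:
            for t in transitions:
                if t.source == state and t.predicate(event):
                    new_marked = marked + (pos,) if t.mark else marked
                    # A match is reported when a marking step enters a final state.
                    if t.mark and t.target in finals:
                        matches.append(new_marked)
                    next_runs.append((t.target, new_marked))
        runs = next_runs
    return matches


# Hypothetical pattern: a hot temperature reading followed, at any later
# point, by a dry humidity reading.
def is_hot(e: Event) -> bool:
    return e["type"] == "T" and e["value"] > 40


def is_dry(e: Event) -> bool:
    return e["type"] == "H" and e["value"] < 25


def anything(e: Event) -> bool:
    return True


transitions = [
    Transition("q0", anything, False, "q0"),  # skip events before the match
    Transition("q0", is_hot, True, "q1"),     # mark the hot reading
    Transition("q1", anything, False, "q1"),  # skip events in between
    Transition("q1", is_dry, True, "q2"),     # mark the dry reading
]

stream = [
    {"type": "T", "value": 45},
    {"type": "H", "value": 20},
    {"type": "T", "value": 30},
    {"type": "T", "value": 50},
    {"type": "H", "value": 10},
]

matches = run(transitions, "q0", {"q2"}, stream)
print(sorted(matches))
```

Each reported match is the tuple of stream positions that the run marked. Because the skip transitions are unrestricted, this sketch reports every combination of a hot reading and a later dry reading. Restricting which of those combinations are returned is the role the abstract assigns to selection strategies.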
