High-performance nested CEP query processing over event streams

Complex event processing (CEP) over event streams has become increasingly important for real-time applications ranging from health care and supply chain management to business intelligence. These monitoring applications submit complex queries to track sequences of events that match a given pattern. As these systems mature, the need to support increasingly complex nested sequence queries arises, yet state-of-the-art CEP systems mostly support the execution of flat sequence queries only. To assure real-time responsiveness and scalability for pattern detection even on high-volume, high-speed streams, efficient processing techniques must be designed. In this paper, we first analyze the prevailing nested pattern query processing strategy and identify several serious shortcomings. Not only are substantial subsequences first constructed just to be subsequently discarded, but opportunities for shared execution of nested subexpressions are also overlooked. As a foundation, we introduce NEEL, a CEP query language for expressing nested CEP pattern queries composed of sequence, negation, AND and OR operators. To overcome these deficiencies, we design rewriting rules for pushing negation into inner subexpressions. Next, we devise a normalization procedure that employs these rules to flatten a nested complex event expression. To reduce CPU and memory consumption, we propose several strategies for efficient shared processing of groups of normalized NEEL subexpressions. These strategies include prefix caching, suffix clustering and customized “bit-marking” execution strategies. We design an optimizer that partitions the set of all CEP subexpressions in a NEEL normal form into groups, each of which can then be mapped to one of our shared execution operators. Lastly, we evaluate our techniques in a performance study that measures CPU processing time on real-world stock trade data. Our results confirm that, for real stock market query workloads, our NEEL execution is in many cases 100-fold faster than the traditional iterative nested execution strategy.
