Complex event analytics: online aggregation of stream sequence patterns

Complex Event Processing (CEP) is a technology of choice for high-performance analytics in time-critical decision-making applications. Yet while effective technologies for complex pattern detection on continuous event streams have been developed, the problem of scalable online aggregation of such patterns has been overlooked. Instead, aggregation is typically applied as a post-processing step after CEP pattern detection, leading to an extremely inefficient solution. In this paper, we demonstrate that CEP aggregation can be pushed into the sequence construction process. Based on this insight, our A-Seq strategy successfully aggregates sequence patterns online without ever constructing sequence matches. This drives down the complexity of the CEP aggregation problem from polynomial to linear. We further extend our A-Seq strategy to support the shared processing of concurrent CEP aggregation queries. The A-Seq solution is shown to achieve over four orders of magnitude performance improvement for a wide range of tested scenarios compared to the state-of-the-art solution.
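To make the idea of aggregating without constructing matches concrete, the following is a minimal Python sketch of prefix counting for a COUNT aggregate over a sequence pattern. It is an illustration under stated assumptions, not the A-Seq implementation: events are reduced to bare type labels, the pattern lists distinct event types in order, and windows, predicates, and query sharing are left out. The function names `count_seq_matches` and `count_by_enumeration` are hypothetical.

```python
from itertools import combinations


def count_seq_matches(stream, pattern):
    """Count matches of SEQ(pattern) in one pass, without building any match.

    Assumption: events are plain type labels; no window or predicate handling.
    """
    # prefix[i] = number of matches of pattern[:i] seen so far.
    # prefix[0] = 1 stands for the single empty prefix.
    prefix = [1] + [0] * len(pattern)
    for event_type in stream:
        # Walk from the longest prefix down so an event cannot extend
        # a partial match that it has just created itself.
        for i in range(len(pattern), 0, -1):
            if pattern[i - 1] == event_type:
                prefix[i] += prefix[i - 1]
    return prefix[-1]


def count_by_enumeration(stream, pattern):
    """Baseline for comparison: enumerate every candidate match, then count."""
    k = len(pattern)
    return sum(
        1
        for idxs in combinations(range(len(stream)), k)
        if all(stream[j] == pattern[i] for i, j in enumerate(idxs))
    )


if __name__ == "__main__":
    stream = ["A", "B", "A", "B", "C"]
    pattern = ["A", "B", "C"]
    print(count_seq_matches(stream, pattern))
    print(count_by_enumeration(stream, pattern))
```

In this sketch each arriving event costs a number of counter updates bounded by the pattern length, so the work grows linearly with the stream, whereas the enumeration baseline examines a number of candidates that grows polynomially with the stream length.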
