Distributed Sequence Pattern Detection Over Multiple Data Streams

Sequence pattern detection over streaming data has many real-world applications. Most existing work processes sequence queries over a single data stream; settings where streaming data arrive from multiple sources have received little attention. In traditional approaches, a single centralized machine processes sequence queries over multiple data streams. When sequence queries run on a single server, the server still receives and processes every event, even though many events in the data streams never contribute to a successful pattern detection. This wastes network bandwidth, server computing resources, and time. In this paper we focus on sequence pattern detection where patterns are defined on chains of events that arrive from multiple distributed data streams. We propose a three-layer distributed framework that avoids unnecessary event processing at the server and efficiently processes sequence queries that detect patterns over chains of events. The bottom layer of data sources sends continuous data streams to the middle layer, which performs pattern detection locally and, based on feedback received from the top layer (the global server), forwards events to the global server, which detects complete patterns. Although our present work targets sequence pattern detection over multiple data streams, the proposed model can be extended to many other areas of distributed stream processing.
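
The three-layer flow described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the framework's actual design: the abstract does not specify the feedback protocol, so the code assumes the global server simply announces which event type it needs next, and each middle-layer node forwards only events of that type. The names `Event`, `GlobalServer`, `LocalNode`, and `run` are hypothetical, and the sketch tracks a single partial match with no time window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    type: str        # event type, e.g. "A"
    timestamp: int   # arrival time
    source: str      # name of the bottom-layer stream that produced it


class GlobalServer:
    """Top layer: tracks progress through one sequence pattern."""

    def __init__(self, pattern):
        self.pattern = list(pattern)
        self.partial = []    # events matched so far
        self.matches = []    # completed pattern instances
        self.received = 0    # events that actually reached the server

    def wanted(self):
        # Assumed feedback: the single event type needed to extend the match.
        return {self.pattern[len(self.partial)]}

    def receive(self, event):
        self.received += 1
        if event.type not in self.wanted():
            return
        if self.partial and event.timestamp <= self.partial[-1].timestamp:
            return  # ignore events that would break the time order
        self.partial.append(event)
        if len(self.partial) == len(self.pattern):
            self.matches.append(tuple(self.partial))
            self.partial = []


class LocalNode:
    """Middle layer: filters a source stream using the server's feedback."""

    def __init__(self, server):
        self.server = server
        self.dropped = 0     # events filtered out locally

    def on_event(self, event):
        # A real deployment would receive pushed feedback over the network;
        # here the node reads it directly from the server object.
        if event.type in self.server.wanted():
            self.server.receive(event)
        else:
            self.dropped += 1


def run(pattern, streams):
    """Replay all source streams in timestamp order through the local nodes."""
    server = GlobalServer(pattern)
    nodes = {name: LocalNode(server) for name in streams}
    arrivals = sorted(
        (e for events in streams.values() for e in events),
        key=lambda e: e.timestamp,
    )
    for event in arrivals:
        nodes[event.source].on_event(event)
    return server, nodes


# Bottom layer: two hypothetical source streams.
streams = {
    "s1": [Event("A", 1, "s1"), Event("C", 2, "s1"), Event("A", 5, "s1")],
    "s2": [Event("B", 3, "s2"), Event("B", 4, "s2"), Event("C", 6, "s2")],
}
server, nodes = run(["A", "B", "C"], streams)
```

In this run only three of the six events reach the global server, which completes one match of the sequence A, B, C; the other three are dropped at the middle layer. That is the bandwidth and processing saving the abstract argues for, shown here on toy data only.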
