Elastic Complex Event Processing under Varying Query Load

Distributed data stream processing systems, such as Twitter Storm or Yahoo! S4, have primarily focused on adapting to varying event rates. However, as these systems become increasingly multi-tenant, adapting to a varying query load is becoming an equally important problem. In this paper we present FUGU, an elastic allocator for Complex Event Processing systems. FUGU uses bin packing to allocate continuous queries to a varying set of nodes. Driven by elasticity requirements, FUGU maximizes overall system utilization while trying to maintain stable processing latencies. The specific contributions of this paper are: (1) a re-balancing scheme for bin packing that allows FUGU to increase overall system utilization by six percent, and (2) a detailed study of the achievable system utilization and latency under a real-life workload from the Frankfurt Stock Exchange.
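To make the bin-packing view of query allocation concrete, the following is a minimal sketch and not FUGU's actual algorithm. It assumes the first-fit decreasing heuristic, a single integer load estimate per query (here, percent of one node's capacity), and identical nodes; the names `Node`, `first_fit_decreasing`, and the sample query loads are hypothetical. It does not model the re-balancing scheme or latency.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A processing node (a 'bin'). Hypothetical type for illustration."""
    capacity: int
    queries: dict = field(default_factory=dict)  # query name -> load

    @property
    def load(self) -> int:
        return sum(self.queries.values())


def first_fit_decreasing(query_loads: dict, capacity: int) -> list:
    """Assign queries to nodes, opening a new node only when none fits.

    Assumption: first-fit decreasing is used purely as an example of a
    bin-packing heuristic; the source does not say which one FUGU uses.
    """
    nodes = []
    # Place the heaviest queries first.
    ordered = sorted(query_loads.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in ordered:
        if load > capacity:
            raise ValueError(f"query {name!r} exceeds node capacity")
        for node in nodes:
            if node.load + load <= capacity:
                node.queries[name] = load
                break
        else:
            # No existing node has room: scale out by one node.
            nodes.append(Node(capacity, {name: load}))
    return nodes


# Hypothetical load estimates, in percent of one node's capacity.
query_loads = {"q1": 50, "q2": 70, "q3": 20, "q4": 40, "q5": 10}
nodes = first_fit_decreasing(query_loads, capacity=100)
for i, node in enumerate(nodes):
    print(i, sorted(node.queries), node.load)
```

In this sketch the number of nodes follows from the query load alone: adding queries can open new nodes, and re-running the packing after queries are removed can yield fewer nodes.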