Tolerating correlated failures in Massively Parallel Stream Processing Engines

Fault-tolerance techniques for stream processing engines can be categorized into passive and active approaches. A typical passive approach periodically checkpoints a processing task's runtime state and recovers a failed task by restoring that state from its latest checkpoint. An active approach, by contrast, usually employs backup nodes to run replicated tasks, so that upon failure the active replica can take over the processing of the failed task with minimal latency. However, both approaches have shortcomings in Massively Parallel Stream Processing Engines (MPSPEs). The passive approach incurs a long recovery latency, especially when a number of correlated nodes fail simultaneously, while the active approach requires extra resources for replication. In this paper, we propose a new fault-tolerance framework that is Passive and Partially Active (PPA). In a PPA scheme, the passive approach is applied to all tasks, while only a selected set of tasks is actively replicated. The number of actively replicated tasks depends on the available resources. If tasks without active replicas fail, tentative outputs are generated before the recovery process completes. We also propose effective and efficient algorithms that optimize a partially active replication plan to maximize the quality of tentative outputs. We implemented PPA on top of Storm, an open-source MPSPE, and conducted extensive experiments using both real and synthetic datasets to verify the effectiveness of our approach.
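
The idea of replicating only a subset of tasks under a resource budget can be illustrated with a minimal sketch. This is not the optimization algorithm proposed in the paper, which the abstract does not describe; it is a simple greedy benefit-per-cost heuristic substituted for illustration. The `Task` fields `cost` and `benefit`, the function names, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    cost: int       # hypothetical: resource units needed for an active replica (> 0)
    benefit: float  # hypothetical: gain in tentative-output quality if replicated


def plan_active_replicas(tasks, budget):
    """Pick tasks to replicate actively with a greedy benefit/cost heuristic.

    Illustrative stand-in only, not the paper's algorithm. Every task is
    assumed to be checkpointed (passive) regardless of this plan.
    """
    chosen = []
    remaining = budget
    # Consider tasks in decreasing order of benefit per resource unit.
    for task in sorted(tasks, key=lambda t: t.benefit / t.cost, reverse=True):
        if task.cost <= remaining:
            chosen.append(task.name)
            remaining -= task.cost
    return chosen


def recovery_mode(task_name, active):
    """Describe how a failed task is handled under a given plan."""
    if task_name in active:
        return "fail over to active replica"
    return "restore from checkpoint, emit tentative outputs meanwhile"


tasks = [Task("source", 2, 0.5), Task("join", 3, 0.9), Task("sink", 1, 0.1)]
plan = plan_active_replicas(tasks, budget=4)
print(plan)
print(recovery_mode("source", plan))
```

With a budget of 4 units, the sketch replicates `join` and `sink` actively, while `source` relies on its checkpoint alone. A larger budget would let more tasks fail over without a recovery delay, which matches the abstract's statement that the number of active replicas depends on the available resources.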
