Availability-Consistency Trade-Offs in a Fault-Tolerant Stream Processing System

This paper presents an approach to fault-tolerant stream processing. In contrast to previous techniques that handle only node failures, our approach also tolerates network failures and network partitions. The approach is based on a principled trade-off between consistency and availability in the face of failure: it (1) ensures that all data on an input stream is processed within a specified time threshold, (2) reduces the impact of failures by limiting, where possible, the number of results produced from partially available input data, and (3) corrects these results when failures heal. Our approach is well-suited for applications such as environment monitoring, where high availability and “real-time” response are preferable to perfect answers. Our approach uses replication and guarantees that all processing replicas achieve state consistency, both in the absence of failures and after a failure heals. We achieve consistency in the former case by defining a data-serializing operator that ensures that the order of tuples delivered to a downstream operator is the same at all replicas. To achieve consistency after a failure heals, we develop approaches based on checkpoint/redo and undo/redo techniques. We have implemented these schemes in a prototype distributed stream processing system, and present experimental results showing that the system meets the desired availability-consistency trade-offs.
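
The idea behind a data-serializing operator can be illustrated with a minimal sketch. The abstract does not describe how the operator is implemented, so everything below is an assumption made for illustration: the names `StreamTuple` and `Serializer`, the fixed-size timestamp buckets, and the sort key are hypothetical. The sketch only shows that if tuples are released in an order that depends on their content rather than on their arrival order, two replicas that receive the same tuples in different interleavings deliver the same sequence downstream.

```python
from typing import Dict, List, NamedTuple


class StreamTuple(NamedTuple):
    timestamp: int   # application timestamp carried by the tuple (assumed field)
    stream_id: str   # input stream the tuple arrived on (assumed field)
    value: float


class Serializer:
    """Hypothetical serializing operator: buffers tuples per timestamp bucket
    and releases each bucket in a content-determined order."""

    def __init__(self, bucket_size: int) -> None:
        self.bucket_size = bucket_size
        self.pending: Dict[int, List[StreamTuple]] = {}

    def push(self, t: StreamTuple) -> None:
        # Group tuples by bucket; arrival order inside a bucket is not trusted.
        self.pending.setdefault(t.timestamp // self.bucket_size, []).append(t)

    def close_bucket(self, bucket: int) -> List[StreamTuple]:
        # Assumes the caller knows that all input for this bucket has arrived.
        # Sorting on the tuple contents makes the output independent of the
        # order in which push() was called.
        batch = self.pending.pop(bucket, [])
        return sorted(batch, key=lambda t: (t.timestamp, t.stream_id, t.value))


# Two replicas see the same tuples in different arrival orders.
arrivals = [
    StreamTuple(3, "b", 1.0),
    StreamTuple(1, "a", 2.0),
    StreamTuple(3, "a", 0.5),
]
replica_a, replica_b = Serializer(bucket_size=5), Serializer(bucket_size=5)
for t in arrivals:
    replica_a.push(t)
for t in reversed(arrivals):
    replica_b.push(t)

out_a = replica_a.close_bucket(0)
out_b = replica_b.close_bucket(0)
print(out_a == out_b)  # True: both replicas deliver the same order
```

The sketch leaves open how a replica learns that a bucket is complete, and what it should do when some input is unavailable during a failure, which is where the availability-consistency trade-off described above comes in.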
