Noria: dynamic, partially-stateful data-flow for high-performance web applications

We introduce partially-stateful data-flow, a new streaming data-flow model that supports eviction and reconstruction of data-flow state on demand. By avoiding state explosion and supporting live changes to the data-flow graph, this model makes data-flow viable for building long-lived, low-latency applications, such as web applications. Our implementation, Noria, simplifies the backend infrastructure for read-heavy web applications while improving their performance. A Noria application supplies a relational schema and a set of parameterized queries, which Noria compiles into a data-flow program that pre-computes results for reads and incrementally applies writes. Noria makes it easy to write high-performance applications without manual performance tuning or complex-to-maintain caching layers. Partial statefulness helps Noria limit its in-memory state without prior data-flow systems' restriction to windowed state, and helps Noria adapt its data-flow to schema and query changes while on-line. Unlike prior data-flow systems, Noria also shares state and computation across related queries, eliminating duplicate work. On a real web application's queries, our prototype scales to 5× higher load than a hand-optimized MySQL baseline. Noria also outperforms a typical MySQL/memcached stack and the materialized views of a commercial database. It scales to tens of millions of reads and millions of writes per second over multiple servers, outperforming a state-of-the-art streaming data-flow system.

[1]  Sheldon J. Finkelstein Common expression analysis in database applications , 1982, SIGMOD '82.

[2]  Frank Wm. Tompa,et al.  Maintaining materialized views without accessing base data , 1988, Inf. Syst..

[3]  Jennifer Widom,et al.  View maintenance in a warehousing environment , 1995, SIGMOD '95.

[4]  Roberta Cochrane,et al.  How to roll a join: asynchronous incremental view maintenance , 2000, SIGMOD 2000.

[5]  Efficient and Extensible Algorithms for Multi Query Optimization , 2000, SIGMOD Conference.

[6]  Surajit Chaudhuri,et al.  Automated Selection of Materialized Views and Indexes in SQL Databases , 2000, VLDB.

[7]  Howard Gobioff,et al.  The Google file system , 2003, SOSP '03.

[8]  Sriram Padmanabhan,et al.  DBProxy: a dynamic data cache for web applications , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[9]  Myoung-Ho Kim,et al.  Optimizing the incremental maintenance of multiple join views , 2005, DOLAP '05.

[10]  Anastasia Ailamaki,et al.  QPipe: a simultaneously pipelined relational query engine , 2005, SIGMOD '05.

[11]  Inderpal Singh Mumick,et al.  Selection of Views to Materialize in a Data Warehouse , 2005, IEEE Trans. Knowl. Data Eng..

[12]  Jingren Zhou,et al.  Partially Materialized Views , 2005 .

[13]  Adam Wierman,et al.  Open Versus Closed: A Cautionary Tale , 2006, NSDI.

[14]  Inderpal Singh Mumick,et al.  Incremental maintenance of aggregate and outerjoin expressions , 2006, Inf. Syst..

[15]  Jingren Zhou,et al.  Efficient Maintenance of Materialized Outer-Join Views , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[16]  Luping Ding,et al.  Dynamic Materialized Views , 2007, 2007 IEEE 23rd International Conference on Data Engineering.

[17]  Hicham G. Elmongui,et al.  Lazy Maintenance of Materialized Views , 2007, VLDB.

[18]  Werner Vogels,et al.  Dynamo: amazon's highly available key-value store , 2007, SOSP.

[19]  Wolfgang Lehner,et al.  Efficient exploitation of similar subexpressions for query processing , 2007, SIGMOD '07.

[20]  Yuan Yu,et al.  Dryad: distributed data-parallel programs from sequential building blocks , 2007, EuroSys '07.

[21]  Carlo Curino,et al.  Schema Evolution in Wikipedia - Toward a Web Information System Benchmark , 2008, ICEIS.

[22]  Werner Vogels,et al.  Eventually consistent , 2008, CACM.

[23]  Hans-Arno Jacobsen,et al.  PNUTS: Yahoo!'s hosted data serving platform , 2008, Proc. VLDB Endow..

[24]  Wilson C. Hsieh,et al.  Bigtable: A Distributed Storage System for Structured Data , 2006, TOCS.

[25]  George Candea,et al.  A Scalable, Predictable Join Operator for Highly Concurrent Data Warehouses , 2009, Proc. VLDB Endow..

[26]  Sanjeev Kumar,et al.  Finding a Needle in Haystack: Facebook's Photo Storage , 2010, OSDI.

[27]  Samuel Madden,et al.  Transactional Consistency and Automatic Management in an Application Data Cache , 2010, OSDI.

[28]  Mahadev Konar,et al.  ZooKeeper: Wait-free Coordination for Internet-scale Systems , 2010, USENIX Annual Technical Conference.

[29]  Lenin Ravindranath,et al.  Nectar: Automatic Management of Data and Computation in Datacenters , 2010, OSDI.

[30]  Steven Hand,et al.  CIEL: A Universal Execution Engine for Distributed Data-Flow Computing , 2011, NSDI.

[31]  Milos Nikolic,et al.  DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views , 2012, Proc. VLDB Endow..

[32]  Christopher Frost,et al.  Spanner: Google's Globally-Distributed Database , 2012, OSDI.

[33]  Gustavo Alonso,et al.  SharedDB: Killing One Thousand Queries With One Stone , 2012, Proc. VLDB Endow..

[34]  Michael J. Franklin,et al.  Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing , 2012, NSDI.

[35]  Dror G. Feitelson,et al.  Development and Deployment at Facebook , 2013, IEEE Internet Computing.

[36]  Tony Tung,et al.  Scaling Memcache at Facebook , 2013, NSDI.

[37]  Daniel Mills,et al.  MillWheel: Fault-Tolerant Stream Processing at Internet Scale , 2013, Proc. VLDB Endow..

[38]  M. Abadi,et al.  Naiad: a timely dataflow system , 2013, SOSP.

[39]  Hui Ding,et al.  TAO: Facebook's Distributed Data Store for the Social Graph , 2013, USENIX Annual Technical Conference.

[40]  Scott Shenker,et al.  Discretized streams: fault-tolerant streaming computation at scale , 2013, SOSP.

[41]  Michael Isard,et al.  Differential Dataflow , 2013, CIDR.

[42]  Cory Hill,et al.  f4: Facebook's Warm BLOB Storage System , 2014, OSDI.

[43]  Eddie Kohler,et al.  Easy Freshness with Pequod Cache Joins , 2014, NSDI.

[44]  Armando Solar-Lezama,et al.  Precise, dynamic information flow for database-backed applications , 2015, PLDI.

[45]  Michael Stonebraker,et al.  S-Store: Streaming Meets Transaction Processing , 2015, Proc. VLDB Endow..

[46]  Seif Haridi,et al.  Apache Flink™: Stream and Batch Processing in a Single Engine , 2015, IEEE Data Eng. Bull..

[47]  Jignesh M. Patel,et al.  Twitter Heron: Stream Processing at Scale , 2015, SIGMOD Conference.

[48]  Anshul Jaiswal,et al.  Realtime Data Processing at Facebook , 2016, SIGMOD Conference.

[49]  Milos Nikolic,et al.  How to Win a Hot Dog Eating Contest: Distributed Incremental View Maintenance with Batch Updates , 2016, SIGMOD Conference.

[50]  Val Tannen,et al.  Incremental View Maintenance For Collection Programming , 2014, PODS.

[51]  Laurie A. Williams,et al.  Continuous Deployment at Facebook and OANDA , 2016, 2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C).

[52]  Jennifer Widom,et al.  STREAM: The Stanford Data Stream Management System , 2016, Data Stream Management.

[53]  Adam Chlipala,et al.  A program optimization for automatic database result caching , 2017, POPL.

[54]  Ingrid Nunes,et al.  Understanding Application-Level Caching in Web Applications , 2017, ACM Comput. Surv..

[55]  Michael I. Jordan,et al.  Ray: A Distributed Framework for Emerging AI Applications , 2017, OSDI.

[56]  Martín Abadi,et al.  Falkirk Wheel: Rollback Recovery for Dataflow Systems , 2015, ArXiv.