DBToaster: Agile Views for a Dynamic Data Management System

This paper calls for a new breed of lightweight systems: dynamic data management systems (DDMS). In a nutshell, a DDMS manages large dynamic data structures with agile, frequently fresh views, and provides a facility for monitoring these views and triggering application-level events. We motivate DDMS with applications in large-scale data analytics, database monitoring, and high-frequency algorithmic trading. We compare DDMS to more traditional data management system architectures. We present the DBToaster project, an ongoing effort to develop a prototype DDMS. We describe its architectural design, its techniques for high-frequency incremental view maintenance, storage, and scaling up by parallelization, and the key challenges that must be overcome to make DDMS a reality.
