Towards a One Size Fits All Database Architecture

We propose a new type of database system called OctopusDB. Our approach is a unified, one-size-fits-all data processing architecture for OLTP, OLAP, streaming systems, and scan-oriented database systems. OctopusDB departs radically from existing architectures in one respect: it uses a logical event log as its primary storage structure. To make this approach efficient, we introduce the concept of Storage Views (SVs), i.e. secondary, alternative physical data representations that cover all or a subset of the primary log. OctopusDB (1) allows different types of SVs to be used for different subsets of the data; and (2) eliminates the need to use different types of database systems for different applications. Thus, depending on the workload, OctopusDB emulates different types of systems: row stores, column stores, streaming systems, and, more importantly, any hybrid combination of these. This is a feature that is impossible to achieve with traditional DBMSs.
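The idea of a primary event log with secondary Storage Views can be illustrated with a small sketch. This is not OctopusDB's implementation; the names `Event`, `EventLog`, `RowView`, and `ColumnView`, the three operation types, and the key-based predicate are all assumptions made for illustration. It only shows how several physical representations might be derived from, and kept in sync with, one logical log.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """One logical log entry (hypothetical format, not taken from the paper)."""
    op: str                                  # "insert", "update" or "delete"
    key: int
    values: Optional[Dict[str, Any]] = None  # None for deletes


class RowView:
    """Row-oriented view: key -> full record, optionally over a subset of the log."""

    def __init__(self, predicate: Callable[[Event], bool] = lambda e: True):
        self.predicate = predicate           # selects which events this view covers
        self.rows: Dict[int, Dict[str, Any]] = {}

    def apply(self, e: Event) -> None:
        if not self.predicate(e):
            return
        if e.op == "delete":
            self.rows.pop(e.key, None)
        elif e.op == "insert":
            self.rows[e.key] = dict(e.values or {})
        else:  # "update": merge the changed attributes into the record
            self.rows.setdefault(e.key, {}).update(e.values or {})


class ColumnView:
    """Column-oriented view: column name -> {key -> value}."""

    def __init__(self) -> None:
        self.columns: Dict[str, Dict[int, Any]] = {}

    def apply(self, e: Event) -> None:
        if e.op == "delete":
            for column in self.columns.values():
                column.pop(e.key, None)
        else:  # inserts and updates both overwrite per-column values
            for name, value in (e.values or {}).items():
                self.columns.setdefault(name, {})[e.key] = value

    def scan(self, name: str) -> List[Any]:
        """Return all current values of one column."""
        return list(self.columns.get(name, {}).values())


class EventLog:
    """Append-only primary store; views are secondary and can be rebuilt."""

    def __init__(self) -> None:
        self.events: List[Event] = []
        self.views: List[Any] = []

    def append(self, e: Event) -> None:
        self.events.append(e)
        for view in self.views:              # keep registered views up to date
            view.apply(e)

    def add_view(self, view):
        for e in self.events:                # build the view by replaying the log
            view.apply(e)
        self.views.append(view)
        return view


log = EventLog()
log.append(Event("insert", 1, {"name": "a", "price": 10}))
log.append(Event("insert", 2, {"name": "b", "price": 20}))
rows = log.add_view(RowView())                              # covers the whole log
hot = log.add_view(RowView(predicate=lambda e: e.key < 2))  # covers a subset
cols = log.add_view(ColumnView())
log.append(Event("update", 1, {"price": 15}))
log.append(Event("delete", 2))
```

In this sketch a point lookup would go to `rows` or `hot`, while a scan over `price` would go to `cols`; all three are derived from the same log and could be dropped and rebuilt by replay. How a real system would choose among views based on the workload is left out.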
