Bigtable: A Distributed Storage System for Structured Data

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this article, we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.
