Paper information - Rethinking main memory OLTP recovery

Rethinking main memory OLTP recovery

Fine-grained, record-oriented write-ahead logging, as exemplified by systems like ARIES, has been the gold standard for relational database recovery. In this paper, we show that in modern high-throughput transaction processing systems, this is no longer the optimal way to recover a database system. In particular, as transaction throughputs get higher, ARIES-style logging starts to represent a non-trivial fraction of the overall transaction execution time. We propose a lighter-weight, coarse-grained command logging technique which only records the transactions that were executed on the database. It then does recovery by starting from a transactionally consistent checkpoint and replaying the commands in the log as if they were new transactions. By avoiding the overhead of fine-grained logging of before and after images (both CPU complexity as well as substantial associated I/O), command logging can yield significantly higher throughput at run-time. Recovery times for command logging are higher compared to an ARIES-style physiological logging approach, but with the advent of high-availability techniques that can mask the outage of a recovering node, recovery speeds have become secondary in importance to run-time performance for most applications. We evaluated our approach on an implementation of TPC-C in a main memory database system (VoltDB), and found that command logging can offer 1.5× higher throughput than a main-memory optimized implementation of ARIES-style physiological logging.
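
The recovery idea described in the abstract can be illustrated with a minimal Python sketch. It is a toy illustration under stated assumptions, not the paper's or VoltDB's implementation: the database is a plain dictionary, the "durable" log is an in-memory list, and the names `CommandLoggedStore`, `deposit`, `transfer`, and `demo` are hypothetical. It also assumes transactions are deterministic stored procedures executed serially, so replaying them in log order reproduces the same state.

```python
import copy

# Hypothetical deterministic "stored procedures" (illustrative names only).
def deposit(db, account, amount):
    db[account] = db.get(account, 0) + amount

def transfer(db, src, dst, amount):
    # Deterministic: the outcome depends only on the database state and arguments.
    if db.get(src, 0) >= amount:
        db[src] -= amount
        db[dst] = db.get(dst, 0) + amount

PROCEDURES = {"deposit": deposit, "transfer": transfer}

class CommandLoggedStore:
    """Toy store: logs commands instead of before/after images of records."""

    def __init__(self):
        self.db = {}
        self.log = []              # stands in for a durable command log
        self.checkpoint = ({}, 0)  # (snapshot, number of log entries it covers)

    def execute(self, name, *args):
        # Record only the procedure name and its parameters, then run it.
        self.log.append((name, args))
        PROCEDURES[name](self.db, *args)

    def take_checkpoint(self):
        # Taken between transactions, so the snapshot is transactionally consistent.
        self.checkpoint = (copy.deepcopy(self.db), len(self.log))

    def recover(self):
        # Start from the checkpoint and replay later commands as new transactions.
        snapshot, covered = self.checkpoint
        self.db = copy.deepcopy(snapshot)
        for name, args in self.log[covered:]:
            PROCEDURES[name](self.db, *args)
        return self.db

def demo():
    store = CommandLoggedStore()
    store.execute("deposit", "alice", 100)
    store.take_checkpoint()
    store.execute("deposit", "bob", 20)
    store.execute("transfer", "alice", "bob", 30)
    expected = dict(store.db)
    store.db = {}                  # simulate losing main-memory contents
    return store.recover() == expected
```

Calling `demo()` runs three transactions, discards the in-memory state, and checks that replaying the log from the checkpoint rebuilds it. The sketch also hints at the trade-off the abstract mentions: each log entry is small, but recovery has to re-execute every transaction logged after the checkpoint.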
