Efficient logging for enterprise workloads on column-oriented in-memory databases

The introduction of a 64-bit address space in commodity operating systems and the steady drop in hardware prices have made main-memory capacities on the order of terabytes technically feasible and economically viable. Column-oriented in-memory databases in particular are a promising platform for improving data management in enterprise applications. Because in-memory databases keep their primary copy of the data in volatile memory, some form of recovery mechanism is required to prevent data loss in case of failures. Two desirable characteristics of any recovery mechanism are (1) that it has minimal impact on the running system, and (2) that the system recovers quickly and without data loss after a failure. This paper introduces an efficient logging mechanism for dictionary-compressed column structures that addresses these two characteristics by (1) reducing the overall log size by writing dictionary-compressed values and (2) allowing log files to be written and read in parallel. We demonstrate the efficiency of our logging approach by comparing the resulting log-file size with that of traditional logical logging on a workload produced by a production enterprise system.
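The idea of logging dictionary-compressed values can be illustrated with a minimal sketch. It is not the paper's log format: the class `DictionaryLoggedColumn`, the two separate logs, the record layouts, and the helpers `logical_log_size` and `recover` are all assumptions made for illustration. Each distinct value is written once to a dictionary log, and every insert then appends only a fixed-width value ID to a value log.

```python
import struct


class DictionaryLoggedColumn:
    """Hypothetical column: an unsorted dictionary plus a vector of value IDs."""

    def __init__(self):
        self.dictionary = {}          # value -> value ID
        self.value_ids = []           # one value ID per row
        self.dict_log = bytearray()   # assumed log of new dictionary entries
        self.value_log = bytearray()  # assumed log of inserted value IDs

    def insert(self, value: str) -> None:
        vid = self.dictionary.get(value)
        if vid is None:
            vid = len(self.dictionary)
            self.dictionary[value] = vid
            raw = value.encode("utf-8")
            # The full value is logged only once, when it enters the dictionary.
            self.dict_log += struct.pack("<IH", vid, len(raw)) + raw
        self.value_ids.append(vid)
        # Every insert logs just the 4-byte value ID.
        self.value_log += struct.pack("<I", vid)


def logical_log_size(values) -> int:
    """Size of an assumed logical log: 2-byte length prefix plus the full value."""
    return sum(2 + len(v.encode("utf-8")) for v in values)


def recover(dict_log, value_log):
    """Rebuild the column's values by replaying both logs."""
    by_id = {}
    offset = 0
    while offset < len(dict_log):
        vid, length = struct.unpack_from("<IH", dict_log, offset)
        offset += 6
        by_id[vid] = bytes(dict_log[offset:offset + length]).decode("utf-8")
        offset += length
    ids = [struct.unpack_from("<I", value_log, i)[0]
           for i in range(0, len(value_log), 4)]
    return [by_id[i] for i in ids]


if __name__ == "__main__":
    rows = ["Berlin", "Potsdam", "Berlin", "Berlin", "Potsdam"] * 100
    column = DictionaryLoggedColumn()
    for row in rows:
        column.insert(row)
    compressed = len(column.dict_log) + len(column.value_log)
    print("dictionary-compressed log bytes:", compressed)
    print("logical log bytes:", logical_log_size(rows))
    print("recovered correctly:", recover(column.dict_log, column.value_log) == rows)
```

In this sketch the saving grows with the number of repeated values, since a repeat costs four bytes regardless of the value's length. Keeping the dictionary log and the value log as separate streams is one assumed way to let them be written and replayed independently; the actual file layout used for parallel logging is not specified in this abstract.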
