Fast Failure Recovery for Main-Memory DBMSs on Multicores

Main-memory database management systems (DBMSs) can achieve excellent performance when processing massive volumes of on-line transactions on modern multi-core machines. But existing durability schemes, namely tuple-level and transaction-level logging-and-recovery mechanisms, either degrade the performance of transaction processing or slow down failure recovery. In this paper, we show that, by exploiting application semantics, it is possible to achieve speedy failure recovery without introducing any costly logging overhead to the execution of concurrent transactions. We propose PACMAN, a parallel database recovery mechanism that is specifically designed for lightweight, coarse-grained transaction-level logging. PACMAN leverages a combination of static and dynamic analyses to parallelize log recovery: at compile time, PACMAN decomposes stored procedures by carefully analyzing dependencies within and across programs; at recovery time, PACMAN exploits the availability of the runtime parameter values to attain an execution schedule with a high degree of parallelism. As a result, recovery performance improves substantially. We evaluated PACMAN in a fully-fledged main-memory DBMS running on a 40-core machine. Compared to several state-of-the-art database recovery mechanisms, PACMAN can significantly reduce recovery time without compromising the efficiency of transaction processing.
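
To make the recovery-time idea concrete, the following is a minimal Python sketch of replaying a transaction-level (command) log in parallel. It is not PACMAN's algorithm: the procedures `transfer` and `deposit`, the `PROCEDURES` table, and the level-based scheduler are all hypothetical, and the sketch conservatively treats every access as a write. It only illustrates how knowing the runtime parameter values of each logged procedure call lets recovery derive which transactions conflict and which can be replayed concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stored procedures (assumptions for illustration only).
def transfer(db, src, dst, amount):
    db[src] -= amount
    db[dst] += amount

def deposit(db, acct, amount):
    db[acct] += amount

# For each procedure: the function to re-execute, and a function that maps
# the logged parameters to the set of keys the call touches. The second
# entry stands in for what a static analysis could derive ahead of time.
PROCEDURES = {
    "transfer": (transfer, lambda src, dst, amount: {src, dst}),
    "deposit": (deposit, lambda acct, amount: {acct}),
}

def build_schedule(log):
    """Place each log record in the earliest level that comes after every
    earlier record touching one of the same keys."""
    last_level = {}  # key -> level of the most recent record touching it
    levels = []
    for name, params in log:
        keys = PROCEDURES[name][1](*params)
        level = 1 + max((last_level.get(k, -1) for k in keys), default=-1)
        if level == len(levels):
            levels.append([])
        levels[level].append((name, params))
        for k in keys:
            last_level[k] = level
    return levels

def replay(db, log, workers=4):
    """Re-execute the log level by level; records within a level touch
    disjoint keys, so they are submitted to the pool together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for level in build_schedule(log):
            # list() forces completion of the level before the next starts.
            list(pool.map(lambda rec: PROCEDURES[rec[0]][0](db, *rec[1]), level))
    return db
```

For example, replaying the log `[("deposit", ("a", 100)), ("deposit", ("b", 50)), ("transfer", ("a", "b", 30))]` against `{"a": 0, "b": 0}` puts the two deposits in one level and the transfer in the next, because the transfer touches both accounts. Conflicting records always land in increasing levels in log order, so the replay yields the same final state as serial re-execution.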
