Aether: A Scalable Approach to Logging

The shift to multi-core hardware brings new challenges to database systems, as software parallelism determines performance. Even though database systems traditionally accommodate simultaneous requests, a multitude of synchronization barriers serialize execution. Write-ahead logging is a fundamental, omnipresent component in ARIES-style concurrency and recovery, and one of the most important yet-to-be-addressed potential bottlenecks, especially in OLTP workloads that make frequent small changes to data. In this paper, we identify four logging-related impediments to database system scalability. Each issue challenges a different level in the software architecture: (a) the high volume of small-sized I/O requests may saturate the disk, (b) transactions hold locks while waiting for the log flush, (c) extensive context switching overwhelms the OS scheduler with threads executing log I/Os, and (d) contention appears as transactions serialize accesses to in-memory log data structures. We demonstrate these problems and address them with techniques that, when combined, comprise a holistic, scalable approach to logging. Our solution achieves a 20%-69% speedup over a modern database system when running log-intensive workloads, such as the TPC-B and TATP benchmarks. Moreover, it achieves log insert throughput of over 1.8 GB/s for small log records on a single-socket server, an order of magnitude higher than the traditional way of accessing the log using a single mutex.
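To make impediments (a) and (d) concrete, the following is a minimal sketch of the traditional baseline the abstract refers to: a log buffer guarded by a single mutex, combined with classic group commit (the subject of references [3] and [4]) so that several small records share one flush. It is not the paper's implementation. The class name `GroupCommitLog`, the `flush_threshold` parameter, and the in-memory `flushed_batches` list standing in for the disk are all hypothetical names chosen for illustration.

```python
import threading


class GroupCommitLog:
    """Toy log buffer: one mutex serializes inserts; flushes are batched."""

    def __init__(self, flush_threshold=4):
        self._mutex = threading.Lock()   # single mutex: every insert serializes here
        self._buffer = []                # (lsn, record) pairs not yet flushed
        self._next_lsn = 0               # byte offset of the next record
        self.flush_threshold = flush_threshold
        self.flushed_batches = []        # stand-in for the disk: one entry per I/O

    def insert(self, record: bytes) -> int:
        """Append a record, return its LSN, and flush once the batch is full."""
        with self._mutex:
            lsn = self._next_lsn
            self._next_lsn += len(record)
            self._buffer.append((lsn, record))
            if len(self._buffer) >= self.flush_threshold:
                self._flush_locked()
            return lsn

    def flush(self) -> None:
        """Force out whatever is buffered (e.g. on a commit timeout)."""
        with self._mutex:
            self._flush_locked()

    def _flush_locked(self) -> None:
        # Caller must hold the mutex. One batch models one I/O request.
        if self._buffer:
            self.flushed_batches.append(list(self._buffer))
            self._buffer.clear()
```

With `flush_threshold=4`, inserting ten 10-byte records and then calling `flush()` produces three batches instead of ten individual I/O requests, which eases the small-I/O problem in (a). Every insert still takes the same mutex, so the contention described in (d) remains untouched by batching alone.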

Ippokratis Pandis | Ryan Johnson | Anastasia Ailamaki | Manos Athanassoulis | Radu Stoica

[1] Michael Stonebraker, et al. Implementation techniques for main memory database systems, 1984, SIGMOD '84.

[2] William E. Weihl, et al. What Good are Concurrent Search Structure Algorithms for databases Anyway?, 1985, IEEE Database Eng. Bull.

[3] Andreas Reuter, et al. Group Commit Timers and High Volume Transaction Systems, 1987, HPTS.

[4] Abbas Rafii, et al. Performance Tradeoffs of Group Commit Logging, 1989, Int. CMG Conference.

[5] C. Mohan, et al. ARIES/KVL: A Key-Value Locking Method for Concurrency Control of Multiaction Transactions Operating on B-Tree Indexes, 1990, VLDB.

[6] Peter M. Spiro. How the Rdb/VMS Data Sharing System Became Fast, 1992.

[7] Hamid Pirahesh, et al. ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging, 1992.

[8] David J. DeWitt, et al. Shoring up persistent applications, 1994, SIGMOD '94.

[9] Eljas Soisalon-Soininen, et al. Partial Strictness in Two-Phase Locking, 1995, ICDT.

[10] Nir Shavit, et al. Elimination Trees and the Construction of Pools and Stacks, 1997, Theory of Computing Systems.

[11] Y. Oyama, et al. Executing Parallel Programs with Synchronization Bottlenecks Efficiently, 1999.

[12] Sashikanth Chandrasekaran, et al. Cache Fusion: Extending Shared-Disk Clusters with Shared Caches, 2001, VLDB.

[13] Michael L. Scott, et al. Non-blocking timeout in scalable queue-based spin locks, 2002, PODC '02.

[14] David B. Lomet. Recovery for Shared Disk Systems Using Multiple Redo Logs, 2002.

[15] Babak Falsafi, et al. Database Servers on Chip Multiprocessors: Limitations and Opportunities, 2007, CIDR.

[16] Michael Stonebraker, et al. The End of an Architectural Era (It's Time for a Complete Rewrite), 2007, VLDB.

[17] Pat Helland, et al. Life beyond Distributed Transactions: an Apostate's Opinion, 2007, CIDR.

[18] Jae-Myung Kim, et al. A case for flash memory ssd in enterprise database applications, 2008, SIGMOD Conference.

[19] Michael Stonebraker, et al. OLTP through the looking glass, and what we found there, 2008, SIGMOD Conference.

[20] Philippe Bonnet, et al. uFLIP: Understanding Flash IO Patterns, 2009, CIDR.

[21] Shimin Chen, et al. FlashLogging: exploiting flash devices for synchronous logging performance, 2009, SIGMOD Conference.

[22] Babak Falsafi, et al. Shore-MT: a scalable storage manager for the multicore era, 2009, EDBT '09.

[23] Ippokratis Pandis, et al. Improving OLTP Scalability using Speculative Lock Inheritance, 2009, Proc. VLDB Endow.