Exploiting Nil-Externality for Fast Replicated Storage

Do some storage interfaces enable higher performance than others? Can one identify and exploit such interfaces to realize high performance in storage systems? This paper answers these questions in the affirmative by identifying nil-externality, a property of storage interfaces. A nil-externalizing (nilext) interface may modify state within a storage system but does not immediately externalize its effects or system state to the outside world. As a result, a storage system can apply nilext operations lazily, improving performance. In this paper, we take advantage of nilext interfaces to build high-performance replicated storage. We implement Skyros, a nilext-aware replication protocol that offers high performance by deferring the ordering and execution of operations until their effects are externalized. We show that exploiting nil-externality offers significant benefit: for many workloads, Skyros provides higher performance than standard consensus-based replication. For example, Skyros offers 3x lower latency while providing the same high throughput offered by throughput-optimized Paxos.
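The nil-externality idea can be illustrated with a minimal single-node sketch (this is an illustration of the property, not the Skyros protocol itself; the class and method names here are hypothetical). A put that returns nothing about internal state is nilext and can be buffered; a get externalizes state, so pending writes must be ordered and applied before it is answered:

```python
class NilextStore:
    """Illustrative key-value store that defers nilext writes until a
    read externalizes their effects (a sketch, not Skyros itself)."""

    def __init__(self):
        self._applied = {}   # state with all writes applied, in order
        self._pending = []   # nilext writes accepted but not yet applied

    def put(self, key, value):
        # put() is nilext: it modifies state but reveals nothing about
        # internal state to the caller, so it can be applied lazily.
        self._pending.append((key, value))

    def get(self, key):
        # get() externalizes state: all pending writes must be ordered
        # and applied before answering.
        self._sync()
        return self._applied.get(key)

    def _sync(self):
        # Apply buffered writes in acceptance order.
        for key, value in self._pending:
            self._applied[key] = value
        self._pending.clear()
```

In the replicated setting, this deferral is what lets nilext operations complete without first running the ordering machinery of consensus on the critical path; ordering is paid for only when a non-nilext operation forces externalization.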
