Performance-Optimal Read-Only Transactions

Read-only transactions are critical for consistently reading data spread across a distributed storage system but have worse performance than simple, non-transactional reads. We identify three properties of simple reads that are necessary for read-only transactions to be performance-optimal, i.e., come as close as possible to simple reads. We demonstrate a fundamental tradeoff in the design of read-only transactions by proving that performance optimality is impossible to achieve with strict serializability, the strongest consistency. Guided by this result, we present PORT, a performance-optimal design with the strongest consistency to date. Central to PORT are version clocks, a specialized logical clock that concisely captures the necessary ordering constraints. We show the generality of PORT with two applications. Scylla-PORT provides process-ordered serializability with simple writes and shows performance comparable to its non-transactional base system. Eiger-PORT provides causal consistency with write transactions and significantly improves the performance of its transactional base system.
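As background for the term "logical clock" used in the abstract, the following is a minimal sketch of a generic Lamport-style logical clock. It is not PORT's version clock, whose rules the abstract does not describe; the class name `LamportClock` and its methods are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Generic Lamport logical clock (illustrative; NOT PORT's version clock)."""

    time: int = 0

    def tick(self) -> int:
        # Local event: advance the clock by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Move past both the local time and the sender's timestamp,
        # so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


client, server = LamportClock(), LamportClock()
stamp = client.send()          # client clock becomes 1
server.tick()
server.tick()                  # server clock becomes 2
after = server.receive(stamp)  # max(2, 1) + 1 = 3
```

The only guarantee sketched here is that a receive is always timestamped later than the corresponding send; how PORT specializes this idea is left to the paper itself.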

Wyatt Lloyd | Haonan Lu | Siddhartha Sen
