Building efficient and available distributed transaction with Paxos-based coding consensus

Supporting distributed transactions is a key function of large-scale database systems. Conventional database systems build distributed transactions on top of replicated storage to provide high availability. However, replication incurs substantial storage overhead. In this paper, we make the first attempt to build highly available distributed transactions over erasure coding to achieve high storage efficiency. We propose Eunice, an Efficient and available distributed transaction protocol that Unifies Concurrency control and Erasure coding. In Eunice, we first design a single-layered coding update mechanism to optimize transaction latency. We then propose a Paxos-based coding consensus protocol that provides fault tolerance and strong consistency for coding update operations. Compared with conventional replication-based distributed transaction protocols, Eunice reduces storage consumption by up to 41.9% while achieving comparable throughput and latency.
