Paper information - Big(ger) sets: decomposed delta CRDT sets in Riak

Big(ger) sets: decomposed delta CRDT sets in Riak

CRDT[24] sets as implemented in Riak[6] perform poorly for writes, both as cardinality grows and for sets larger than 500KB[25]. Riak users wish to create high-cardinality CRDT sets and expect better than O(n) performance for individual insert and remove operations. By decomposing a CRDT set on disk and employing delta-replication[2], we can achieve far better performance than delta replication alone: the cost of an operation is relative to the size of the causal metadata, not the cardinality of the set. We can also support sets that are hundreds of times the size of Riak sets while still providing the same level of consistency. There is a trade-off in read performance, but we expect it to be mitigated by enabling queries on sets.
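The idea of decomposing the set so that a write touches one element plus causal metadata can be illustrated with a minimal sketch. This is not the paper's algorithm or Riak's implementation; it is an add-wins delta set written under simplifying assumptions. The class name `DecomposedDeltaSet`, the delta tuple layout, and the in-memory `elements` dict (standing in for one on-disk key per element) are all hypothetical, and the `seen` set of dots is left uncompacted where a real system would compress it into a clock.

```python
from dataclasses import dataclass, field


@dataclass
class DecomposedDeltaSet:
    """Hypothetical add-wins set; each element lives under its own key."""
    actor: str
    counter: int = 0                               # local event counter
    seen: set = field(default_factory=set)         # every dot observed (uncompacted)
    elements: dict = field(default_factory=dict)   # element -> set of live dots

    def add(self, element):
        # Mint a fresh dot; it supersedes the dots this replica holds for the element.
        self.counter += 1
        dot = (self.actor, self.counter)
        delta = (element, dot, frozenset(self.elements.get(element, ())))
        self.apply(delta)
        return delta  # only this small delta is replicated, not the whole set

    def remove(self, element):
        # A remove carries no new dot, only the dots it has observed.
        delta = (element, None, frozenset(self.elements.get(element, ())))
        self.apply(delta)
        return delta

    def apply(self, delta):
        # Touches a single element entry plus causal metadata.
        element, dot, removed = delta
        live = self.elements.get(element, set()) - removed
        self.seen |= removed          # remember removed dots so late adds cannot resurrect them
        if dot is not None and dot not in self.seen:
            self.seen.add(dot)
            live = live | {dot}
        if live:
            self.elements[element] = live
        else:
            self.elements.pop(element, None)

    def value(self):
        return set(self.elements)
```

In this sketch, applying a delta is idempotent, a remove that arrives before the add it covers still takes effect, and an add concurrent with a remove survives because its dot was never observed by the remover.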

[1] Colin J. Fidge, et al. Logical time in distributed computing systems, 1991, Computer.

[2] Marc Shapiro, et al. Conflict-Free Replicated Data Types, 2011, SSS.

[3] Ali Shoker, et al. Efficient State-Based CRDTs by Delta-Mutation, 2014, NETYS.

[4] Paulo Sérgio Almeida, et al. Scalable and Accurate Causality Tracking for Eventually Consistent Stores, 2014, DAIS.

[5] Dahlia Malkhi, et al. Concise version vectors in WinFS, 2005, Distributed Computing.

[6] Marc Shapiro, et al. A comprehensive study of Convergent and Commutative Replicated Data Types, 2011.

[7] Werner Vogels, et al. Building reliable distributed systems at a worldwide scale demands trade-offs between consistency and availability, 2022.

[8] Werner Vogels, et al. Dynamo: Amazon's highly available key-value store, 2007, SOSP.

[9] Russell Brown, et al. Riak DT map: a composable, convergent replicated dictionary, 2014, PaPEC '14.

[10] Sérgio Duarte, et al. An optimized conflict-free replicated set, 2012, ArXiv.