Maxwell: a highly integrated hardware and software compute-storage system

The compute-storage framework is responsible for data storage and processing and acts as the digital chassis of all upper-level businesses. Its performance affects the business's processing throughput, latency, and jitter, and also determines the theoretical upper bound on the performance the business can achieve. In financial applications, the compute-storage framework must provide high reliability and high throughput together with low latency and low jitter. In some scenarios, such as hot-spot account updating, the compute-storage framework can even become a severe performance bottleneck for the whole business system. In this paper, we study the hot-spot account problem faced by Alipay and present our solution: a new compute-storage system called Maxwell. Maxwell is a distributed compute-storage system with integrated hardware and software optimizations. It does not rely on specialized hardware (e.g., GPUs or FPGAs); instead, it exploits the characteristics of commodity components, such as the disk, network, operating system, and CPU, to extract the full performance of both hardware and software. Compared with the existing hot-spot account updating solutions deployed online, Maxwell achieves a three-orders-of-magnitude performance improvement in end-to-end evaluation. Maxwell has also demonstrated notable performance gains in other related businesses at Ant Group.
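To make the hot-spot account problem concrete, the minimal sketch below (illustrative only, not Maxwell's design; the Account class, io_delay parameter, and all other names are hypothetical) shows why updates to a single hot account cap end-to-end throughput: every writer must serialize on that account's lock, so throughput is bounded by the per-update commit latency regardless of how many threads are added.

    # Toy model of hot-spot account contention (not from the paper).
    # All updates to one account serialize on its lock, so adding
    # threads does not raise throughput on that account.
    import threading
    import time

    class Account:
        def __init__(self, balance=0):
            self.balance = balance
            self.lock = threading.Lock()  # per-account lock, like a row lock

        def credit(self, amount, io_delay=0.001):
            with self.lock:          # every writer queues here
                time.sleep(io_delay) # stands in for log/commit latency
                self.balance += amount

    def run(threads, updates_per_thread):
        hot = Account()
        def worker():
            for _ in range(updates_per_thread):
                hot.credit(1)
        ts = [threading.Thread(target=worker) for _ in range(threads)]
        start = time.time()
        for t in ts:
            t.start()
        for t in ts:
            t.join()
        total = threads * updates_per_thread
        print(f"{threads} threads: {total / (time.time() - start):.0f} updates/s")

    for n in (1, 4, 16):
        run(n, 100)

In this toy model the hot account sustains roughly 1/io_delay updates per second no matter how many threads contend, which is why hot-spot scenarios push the bottleneck into the compute-storage layer rather than the application tier.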
