On Optimizing Traffic Imbalance in Large-scale Block-based Cloud Storage: Trace Analysis and Algorithm Design

Cloud block storage (CBS) serves as the fundamental infrastructure of modern cloud computing services such as cloud disks. Large-scale cloud block storage usually adopts a layered architecture: a forwarding layer, in which a cluster of proxy servers provides the cloud disk abstraction, and a unified distributed storage engine that provides persistent data storage. However, because all I/O traffic passes through the proxy servers in the forwarding layer, traffic can become severely imbalanced across the proxy servers, which ultimately degrades the performance of cloud disks. To investigate this traffic imbalance problem in the forwarding layer, we first conduct an in-depth analysis of the workload traces of a large-scale production cloud block storage system. We find that both the traffic of individual cloud disks and the consolidated traffic of cloud disks at proxy servers are highly skewed and fluctuate violently and frequently at a fine time granularity, thus causing severe traffic imbalance. To address the traffic imbalance issue, we then develop a low-cost migration algorithm, weighted partial migration (WPM), and evaluate its effectiveness through trace-replay simulation. Experiments under real-world workloads show that for 84.3% of clusters, WPM reduces the imbalance factor below 3 (i.e., the maximum traffic at a proxy server is within 3$\times$ of the median traffic), at a very small migration cost: only 0.1% of segments are migrated.
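As an illustration of the imbalance factor used above (maximum per-proxy traffic divided by median per-proxy traffic), the following minimal Python sketch computes it for a set of proxy servers. The function name `imbalance_factor` and the sample traffic values are hypothetical and chosen only for illustration; they are not taken from the traces, and the sketch does not implement WPM itself.

```python
from statistics import median


def imbalance_factor(proxy_traffic):
    """Return max per-proxy traffic divided by median per-proxy traffic.

    `proxy_traffic` is a list of traffic volumes, one per proxy server,
    measured over the same time window (units are arbitrary but must match).
    """
    if not proxy_traffic:
        raise ValueError("need traffic for at least one proxy server")
    med = median(proxy_traffic)
    if med == 0:
        raise ValueError("median traffic is zero; imbalance factor is undefined")
    return max(proxy_traffic) / med


# Hypothetical traffic samples (e.g. MB/s in one time window), not real data.
balanced = [100, 110, 95, 105, 120]
skewed = [100, 110, 95, 105, 900]

print(f"balanced cluster: {imbalance_factor(balanced):.2f}")
print(f"skewed cluster:   {imbalance_factor(skewed):.2f}")
```

Under this definition, a cluster meets the target stated in the abstract when the returned value is smaller than 3.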
