An Empirical Comparative Study of Decentralized Load Balancing Algorithms in Clustered Storage Environment

Load balancing is critical for large-scale storage systems to deliver high I/O performance. Decentralized solutions are especially attractive because they have no single point of bottleneck. We implement four typical hypercube-based decentralized load balancing algorithms in a prototype storage system and conduct extensive experiments on a 32-node testbed, comparing the efficiency and scalability of the four algorithms. The results lead to two new observations that contradict the conclusions of previous simulation studies. First, algorithms that avoid redundant load migration do not actually reduce migration costs. Second, algorithms that tolerate a certain degree of redundancy in load migration may scale better. These two observations provide new insights into the design of load balancing algorithms for distributed storage systems.
