A load balancing framework for clustered storage systems

The load balancing framework for high-performance clustered storage systems presented in this paper provides a general method for reconfiguring a system facing dynamic workload changes. It simultaneously balances load and minimizes the cost of reconfiguration. It can be used for automatic reconfiguration or to present an administrator with a range of (near) optimal reconfiguration options, allowing a tradeoff between load distribution and reconfiguration cost. The framework supports a wide range of measures for load imbalance and reconfiguration cost, as well as several optimization techniques. The effectiveness of this framework is demonstrated by balancing the workload on a NetApp Data ONTAP GX system, a commercial scale-out clustered NFS server implementation. The evaluation scenario considers consolidating two real-world systems, with hundreds of users each: a six-node clustered storage system supporting engineering workloads and a legacy system supporting three email servers.
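The tradeoff between load distribution and reconfiguration cost can be illustrated with a minimal sketch. It is not the paper's method: it assumes one possible imbalance measure (the population standard deviation of per-node load) and one possible cost measure (the amount of data moved), and it filters a handful of hypothetical reconfiguration options down to those that are not dominated on both measures. All names and numbers below are invented for illustration.

```python
from statistics import pstdev


def imbalance(node_loads):
    """Assumed imbalance measure: population standard deviation of per-node load."""
    return pstdev(node_loads)


def pareto_front(options):
    """Return the options not dominated in (imbalance, cost); lower is better for both."""
    front = []
    for name, imb, cost in options:
        dominated = any(
            (i2 <= imb and c2 <= cost) and (i2 < imb or c2 < cost)
            for _, i2, c2 in options
        )
        if not dominated:
            front.append((name, imb, cost))
    # Order by cost so the list reads from cheapest to most expensive option.
    return sorted(front, key=lambda o: o[2])


# Hypothetical candidates: (label, per-node loads after the change, data moved in GB).
candidates = [
    ("no change", [90, 30, 30], 0),
    ("move one volume", [60, 60, 30], 40),
    ("move two volumes", [50, 50, 50], 120),
    ("wasteful move", [60, 30, 60], 200),
]

options = [(name, imbalance(loads), cost) for name, loads, cost in candidates]
front = pareto_front(options)

for name, imb, cost in front:
    print(f"{name}: imbalance={imb:.2f}, cost={cost} GB")
```

The surviving options form the kind of range an administrator could choose from: each one buys a better load distribution only at a higher reconfiguration cost, while the "wasteful move" is dropped because a cheaper option balances load at least as well.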
