Workload balancing and adaptive resource management for the Swift storage system on cloud

The demand for big data storage and processing has become a challenge in today's industry. To meet it, a growing number of enterprises are adopting distributed storage systems. In these systems, storage nodes that hold a concentration of hotspot data can become bottlenecks, while nodes without hotspot data leave computing resources underutilized. This stems from the fact that almost all typical distributed storage systems provide only data-amount-oriented balancing mechanisms and do not consider the differing access load of the data. To eliminate the bottlenecks and optimize resource utilization, such systems need a workload balancing and adaptive resource management framework. In this paper, we propose such a framework for Swift, a widely used and typical distributed storage system on the cloud. Within the framework, we design workload monitoring and analysis algorithms that discover overloaded and underloaded nodes in the cluster. To balance the workload among those nodes, the Split, Merge, and Pair algorithms regulate physical machines, while the Resource Reallocate algorithm regulates virtual machines on the cloud. In addition, by leveraging the mature architecture of distributed storage systems, the framework resides in the hosts and operates through API interception. We conduct experiments to evaluate the framework, and the results show that it achieves these goals.

We propose a workload balancing and adaptive resource management framework for Swift. We implement optimization algorithms for dynamic workload balancing in Swift. We conduct an experiment to demonstrate the effectiveness of this framework.
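To make the idea of discovering overloaded and underloaded nodes concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes a single per-node load metric (for example, request rate) and a tolerance band around the cluster mean; the function names `classify_nodes` and `pair_nodes`, the `tolerance` parameter, and the sample node names are all hypothetical.

```python
from statistics import mean


def classify_nodes(loads, tolerance=0.25):
    """Split nodes into overloaded and underloaded lists.

    loads: dict of node name -> observed load (assumed metric, e.g. requests/s).
    tolerance: assumed fractional band around the cluster mean.
    """
    avg = mean(loads.values())
    high = avg * (1 + tolerance)
    low = avg * (1 - tolerance)
    # Heaviest nodes first, so they are handled before milder cases.
    overloaded = sorted(
        (n for n, v in loads.items() if v > high), key=loads.get, reverse=True
    )
    # Lightest nodes first, so they receive load before busier ones.
    underloaded = sorted((n for n, v in loads.items() if v < low), key=loads.get)
    return overloaded, underloaded


def pair_nodes(overloaded, underloaded):
    """Pair each overloaded node with one underloaded node (illustrative only)."""
    return list(zip(overloaded, underloaded))


# Hypothetical sample data: the cluster mean is 500, so the band is 375 to 625.
loads = {"node-a": 900, "node-b": 120, "node-c": 480, "node-d": 500}
overloaded, underloaded = classify_nodes(loads)
pairs = pair_nodes(overloaded, underloaded)
```

With this sample data, `node-a` falls above the band and `node-b` below it, so they form the only pair. A mean-relative threshold is just one possible design; the abstract does not state which criteria the framework uses.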
