Ursa: Scalable Load and Power Management in Cloud Storage Systems

Enterprise and cloud data centers comprise tens of thousands of servers that provide petabytes of storage to a large number of users and applications. At such a scale, these storage systems face two key challenges: (1) hot-spots caused by the dynamic popularity of stored objects; and (2) high operational costs due to power and cooling. Existing storage solutions, however, are ill-suited to these challenges because they do not scale to such large numbers of servers and data objects. This article describes the design, implementation, and evaluation of Ursa, a system that scales to a large number of storage nodes and objects and aims to minimize latency and bandwidth costs during system reconfiguration. Toward this goal, Ursa formulates an optimization problem that selects a subset of objects from hot-spot servers and performs topology-aware migration to minimize reconfiguration costs. Because exact optimization is computationally expensive, we devise scalable approximation techniques for node selection and efficient divide-and-conquer computation. We also show that the same dynamic reconfiguration techniques can reduce power costs by dynamically migrating data off under-utilized nodes and by powering up servers neighboring existing hot-spots to reduce reconfiguration costs. Our evaluation shows that Ursa achieves cost-effective load management, computes placement decisions quickly (e.g., about two minutes for 10K nodes and 10M objects), and provides power savings of 15% to 37%.
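To make the idea of topology-aware migration concrete, the following is a minimal greedy sketch, not Ursa's actual optimization formulation or its approximation techniques. It assumes a hypothetical two-level topology (moving within a rack costs 1, across racks costs 4), a hypothetical utilization threshold of 80%, and illustrative names (`Node`, `distance`, `plan_migrations`) that do not come from the article. It sheds the hottest objects from an overloaded node to the closest node with spare room.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    rack: str
    capacity: float  # maximum load the node can serve (hypothetical unit)
    objects: dict = field(default_factory=dict)  # object id -> load

    @property
    def load(self) -> float:
        return sum(self.objects.values())


def distance(a: Node, b: Node) -> int:
    """Assumed topology cost: 1 within a rack, 4 across racks."""
    return 1 if a.rack == b.rack else 4


def plan_migrations(nodes, threshold=0.8):
    """Greedily move hot objects off overloaded nodes to the nearest node with room."""
    moves = []
    for src in nodes:
        # Consider the hottest objects first; sorted() copies, so popping is safe.
        for obj, load in sorted(src.objects.items(), key=lambda kv: -kv[1]):
            if src.load <= threshold * src.capacity:
                break  # src is no longer a hot-spot
            candidates = [
                d for d in nodes
                if d is not src and d.load + load <= threshold * d.capacity
            ]
            if not candidates:
                continue  # no destination can absorb this object
            # Prefer the topologically closest node, then the least loaded one.
            dst = min(candidates, key=lambda d: (distance(src, d), d.load))
            dst.objects[obj] = src.objects.pop(obj)
            moves.append((obj, src.name, dst.name))
    return moves


nodes = [
    Node("a", "r1", 100, {"x": 50, "y": 40, "z": 10}),
    Node("b", "r1", 100, {"p": 20}),
    Node("c", "r2", 100),
]
moves = plan_migrations(nodes)
```

In this example node `a` starts above the threshold, and the sketch prefers the same-rack node `b` over the remote node `c` because the assumed migration cost is lower. A greedy pass like this gives no optimality guarantee, which is the gap an explicit optimization formulation is meant to close.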
