Unifying Cloud Management: Towards Overall Governance of Business Level Objectives

We address the challenge of providing unified cloud resource management towards an overall business level objective, given the multitude of managerial tasks to be performed and the complexity of any architecture to support them. Resource level management tasks include elasticity control, virtual machine and data placement, autonomous fault management, etc. These are intrinsically difficult problems, since services normally have unknown lifetimes and capacity demands that vary widely over time. Unifying the management of these problems, in order to optimize with respect to a higher level business objective such as maximizing revenue while breaking no more than a certain percentage of service level agreements, is even more challenging because the resource level managerial challenges are far from independent. After providing the general problem formulation, we review recent approaches taken by the research community, mainly general autonomic computing technology for large-scale environments and resource level management tools equipped with some business oriented or otherwise qualitative features. We propose and illustrate a policy-driven approach in which a high-level management system monitors overall system and service behavior and adjusts lower level policies (e.g., thresholds for admission control, elasticity control, and server consolidation level) to optimize towards the measurable business level objectives.
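
As a rough illustration of the policy-driven idea, the following minimal Python sketch shows one step of a high-level manager that observes the fraction of broken service level agreements and nudges two lower level thresholds accordingly. It is not the system described in the paper: the names `Policies`, `Observation`, and `adjust_policies`, the 5% violation budget, the step size, and the threshold ranges are all assumptions made for the example, as is the simplification that looser thresholds tend to raise revenue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policies:
    """Lower level policy knobs (hypothetical names and default values)."""
    admission_threshold: float = 0.80  # admit new services while utilization is below this
    consolidation_level: float = 0.70  # target per-server utilization when packing VMs


@dataclass(frozen=True)
class Observation:
    """SLA outcome of one monitoring interval (hypothetical structure)."""
    sla_total: int
    sla_violated: int


def clamp(value: float, low: float, high: float) -> float:
    """Keep a threshold inside its allowed range, rounded to two decimals."""
    return round(max(low, min(high, value)), 2)


def adjust_policies(policies: Policies, obs: Observation,
                    max_violation_rate: float = 0.05,
                    step: float = 0.05) -> Policies:
    """One control step of the high-level manager.

    Assumption: higher thresholds admit and pack more work (more revenue)
    at the cost of more SLA violations. If the observed violation rate
    exceeds the budget, both thresholds are lowered; otherwise both are
    raised to pursue revenue.
    """
    rate = obs.sla_violated / obs.sla_total if obs.sla_total else 0.0
    direction = -1 if rate > max_violation_rate else 1
    return Policies(
        admission_threshold=clamp(
            policies.admission_threshold + direction * step, 0.50, 0.95),
        consolidation_level=clamp(
            policies.consolidation_level + direction * step, 0.40, 0.90),
    )


if __name__ == "__main__":
    current = Policies()
    # 10 of 100 SLAs broken: above the 5% budget, so thresholds tighten.
    print(adjust_policies(current, Observation(sla_total=100, sla_violated=10)))
    # 1 of 100 SLAs broken: within budget, so thresholds loosen.
    print(adjust_policies(current, Observation(sla_total=100, sla_violated=1)))
```

A real manager would also have to account for the interdependence between the knobs noted above; this sketch moves them in lockstep purely for brevity.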
