Toward Energy-aware Scheduling Using Machine Learning
Energy-efficient Distributed Computing Systems, First Edition.
8.1.1 Energetic Impact of the Cloud; 8.1.2 An Intelligent Way to Manage

The cloud and Web 2.0 have helped democratize the Internet, allowing everybody to share information, services, and IT resources across the network. With the arrival of social networks and the introduction of new IT infrastructures in the business world, the Internet population has grown enough to make the need for computing resources an important matter. While a few years ago enterprises kept all their IT infrastructure in privately owned data centers, nowadays the big IT corporations have started a data-center race, offering computing and storage resources at low prices and looking for outside companies to trust them with their data or IT needs.

A single web application in the cloud is easily used by people from around the world, so data and computation need to be available everywhere, keeping in mind the quality of service (QoS) and the service-level agreements (SLAs) between users and servers. Services offered by Google and YouTube, for example, must be replicated around the globe or be efficient enough to move data, jobs, or applications among the data-center farms spread across the planet. Given the number of applications now running on the cloud and the number still to come, coordinating all these applications, resources, and services becomes a hard optimization problem in itself.

Having data centers powerful enough to serve applications or computation time is not the end of the story. As energy-related costs have become a major economic factor for IT infrastructures and data centers, power consumption has become an important element to keep in mind when designing and managing them. This energy cost is reflected in electricity consumption, which is sometimes not linear with the capacity of the data center, and also in the impact on the natural environment and in social pressure.
Companies, enterprises dedicated to cloud-based services, and the research community are being challenged to find better and more efficient power-aware resource management strategies. Until now, technological improvement sufficed to cover the increasing IT demand, bringing faster processors, bigger storage devices, and faster connections between resources, and the energy factor was not relevant enough to receive attention. Now demand is growing faster than technology improves, so we keep needing bigger data centers, in colder places, with sufficient power supply. All this has a serious impact on the natural environment (1) and also on the economic cost of maintaining …
