Towards Superclouds

Cloud computing has emerged as an economically attractive utility model for computational resources. A growing range of organizations, from businesses to governments, are embracing it. However, the economic benefits of the cloud computing model come at a price: the loss of control over how services and applications can use computing resources. In other words, cloud computing is fundamentally provider-centric. Cloud providers (the providers of computational resources), not cloud users (the consumers of computational resources), dictate the rules and policies governing how computational resources can be used. For large enterprise application workloads, adhering to cloud provider rules and policies may be prohibitive. This dissertation explores the question: how can large enterprise workloads efficiently utilize and control computational resources from a variety of providers in the cloud computing model?

The main contributions of this dissertation relate to a fundamental change to the cloud computing model. Instead of a provider-centric model, we propose a user-centric model in which the cloud user maintains control over how computational resources obtained from cloud providers are used. We have devised a new abstraction, called cloud extensibility, that enables cloud users to implement provider-level functionality themselves. Leveraging cloud extensibility, we describe steps towards a user-centric cloud computing model that grants cloud users, including large enterprises, control over resources obtained from one or more cloud providers. We call this new model the supercloud model.

More specifically, we focus on three key areas in which current provider-centric cloud computing models either do not expose the necessary control or lack the features needed to support large enterprise workloads without significant reconfiguration effort. First, clouds are not interoperable, which restricts workloads to a single provider and hinders incremental migration to the cloud. Second, clouds lack support for complex enterprise network configurations, including flow policies between application components and low-level network features (e.g., IP addresses, multicast, VLANs). Finally, high utilization of cloud resources cannot be achieved through techniques like oversubscription, and existing techniques do not apply well to common workload patterns.

Accordingly, we make three contributions, embodied in the design, implementation, and evaluation of three systems that leverage cloud extensibility. Cloud extensibility itself is instantiated in the first system, a nested virtualization layer called the Xen-Blanket. The Xen-Blanket additionally enables cloud interoperability by homogenizing existing cloud interfaces and services. The second system, VirtualWire, provides a virtual network abstraction that supports complex enterprise networks and lets the cloud user manage the network control logic. Finally, we present Overdriver, a system that enables high resource utilization through memory oversubscription and handles the resulting memory overload, which is often transient and unpredictable. Together, these three systems are important steps towards superclouds.
