DVM: A Big Virtual Machine for Cloud Computing

As cloud-based computation becomes an increasingly important paradigm, providing a general computational interface to support datacenter-scale programming has become a pressing research agenda. Many cloud systems use existing virtual machine monitor (VMM) technologies, such as Xen, VMware, and Windows Hypervisor, to multiplex a physical host into multiple virtual hosts and isolate computation on the shared cluster platform. However, traditional multiplexing VMMs do not scale beyond a single physical host, and they alone cannot provide the programming interface and cluster-wide computation that a datacenter system requires. We design a new instruction set architecture, DISA, to unify myriads of compute nodes into a big virtual machine called DVM and present programmers with the view of a single computer, where thousands of tasks run concurrently in a large, unified, and snapshotted memory space. The DVM provides a simple yet scalable programming model and mitigates the scalability bottleneck of traditional distributed shared memory systems. Combined with an efficient execution engine, the capacity of a DVM can scale up to support large clusters. We have implemented and tested DVM on four platforms, and our evaluation shows that DVM has excellent performance and scalability. On one physical host, the system overhead of DVM is comparable to that of traditional VMMs. On 16 physical hosts, the DVM runs 10 times faster than MapReduce/Hadoop and X10. On 160 compute nodes in the TH-1/GZ supercomputer, the DVM delivers a 12.99× speedup over the computation on 10 compute nodes. The implementation of DVM also allows it to run above traditional VMMs, and we verify that DVM shows linear speedup on a parallelizable workload on 256 large EC2 instances.
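To put the TH-1/GZ figure in context, the reported speedup can be compared against the ideal linear speedup for the same increase in node count. The sketch below is only illustrative arithmetic on the two numbers quoted in the abstract; the function name `scaling_efficiency` is a hypothetical helper, not part of DVM.

```python
def scaling_efficiency(speedup, base_nodes, scaled_nodes):
    """Return measured speedup as a fraction of ideal linear speedup.

    Hypothetical helper for illustration; not part of DVM.
    """
    # Ideal linear speedup equals the growth factor in node count.
    ideal = scaled_nodes / base_nodes
    return speedup / ideal


# Figures quoted in the abstract: 12.99x speedup going from 10 to 160 nodes.
eff = scaling_efficiency(12.99, base_nodes=10, scaled_nodes=160)
print(f"ideal speedup: {160 / 10:.0f}x, efficiency: {eff:.1%}")
```

Under this reading, a 16-fold increase in nodes yields a 12.99× speedup, which is roughly 81% of ideal linear scaling.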
