Distributed file system support for virtual machines in grid computing

This paper presents a data management solution that allows fast virtual machine (VM) instantiation and efficient run-time execution, supporting VMs as execution environments in grid computing. It is based on novel distributed file system virtualization techniques and is unique in that: 1) it provides on-demand access to VM state for unmodified VM monitors; 2) it supports user-level and write-back disk caches, per-application caching policies, and middleware-driven consistency models; and 3) it supports the use of meta-data associated with files to expedite data transfers. The paper reports on its performance in a WAN setup using VMware-based VMs. Results show that the solution delivers performance over 30% better than native NFS and can bring application-perceived overheads below 10% relative to a local disk setup. The solution also allows a VM with a 1.6 GB virtual disk and 320 MB of virtual memory to be cloned within 160 seconds when it is first instantiated (and within 25 seconds for subsequent clones).
