Improving Software Dependability Using System-Level Virtualization: A Survey

This paper surveys uses of system-level virtualization to improve the dependability of software systems. Notable works are presented in two categories: the attributes of dependability and the means to attain those attributes. The survey covers work on improving individual attributes of dependability, including availability, reliability, safety, and integrity, as well as fault tolerance as a means to attain these attributes. In addition, notable research efforts in this area, such as device driver separation, virtualization-based replication, and using VMMs to check running guest operating systems, are discussed.
