The design of the Inferno virtual machine

Virtual machines are an important component of modern portable environments such as Inferno and Java because they provide an architecture-independent representation of executable code. Their performance is critical to the success of such environments, but they are difficult to design well because they are subject to conflicting goals. On the one hand, they offer a way to hide the differences between instruction architectures; on the other, they must be implemented efficiently on a variety of underlying machines. A comparison of the engineering and evolution of the Inferno and Java virtual machines provides insight into the tradeoffs in their design and implementation. We argue that the design of virtual machines should be rooted in the nature of modern processors, not language interpreters, with an eye towards on-the-fly compilation rather than interpretation or special-purpose silicon.

Dis, the Inferno Virtual Machine

In early 1995, we set out to apply the ideas of the Plan 9 operating system [1] to a wider range of devices and networks. The resulting system, Inferno [2], is a small operating system and execution environment that supports application portability across a wide variety of processors and operating systems. Unaware of the contemporary work to establish Java [3] from the technology of the Oak project, we independently concluded that a virtual machine (VM) was a necessary component of such a system [4]. Because of improvements in processor speed and the feasibility of on-the-fly compilers, a VM can execute quickly enough to be economically viable.

The Inferno virtual machine, called Dis, has several unusual aspects to its design: the instruction set, the module system, and the garbage collector.

The Dis instruction set provides a close match to the architecture of existing processors.
Instructions are of the form

    OP src1, src2, dst

The src1 and dst operands specify general addresses or arbitrary-sized constants, while the src2 operand is restricted to smaller constants and stack offsets to reduce code space. Each operand specifies an address either in the stack frame of the executing procedure or in the global data of its module. The types of operands are set by the instructions. Basic types are word (32-bit signed), big (64-bit signed), byte (8-bit unsigned), real (64-bit IEEE floating point), and pointer (implementation-dependent). The instruction set follows the example of CISC processors, providing three-operand memory-to-memory operations for arithmetic, data motion, and so on. It also has instructions to allocate memory, to load modules, and to create, synchronize, and communicate between processes.

A module is the unit of dynamically loaded code and data. Modules are loaded by a VM instruction that returns a pointer to a method table for the module. That pointer is managed by the VM’s garbage collector, so code and data for the module are garbage collected like any other memory. Type safety is preserved by checking method types at module load time using an MD5 signature of the type.

Memory management is intimately tied to the instruction set of the VM. Dis uses a hybrid garbage collection scheme: most garbage is collected by simple reference counting, while a real-time coloring collector gathers cyclic data. Because reference counting is an exact rather than conservative form of garbage collec
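
The three-operand, memory-to-memory instruction form described above can be illustrated with a small interpreter loop. The following Python sketch is not the Dis encoding or opcode set: the names Space, Operand, Instr, execute, and demo, and the opcode spellings "addw" and "mulw", are hypothetical and chosen only for illustration. It assumes that each operand names a slot in either the frame or the module data (or holds a constant), and that the instruction, not the operand, fixes the type, here the 32-bit signed word.

```python
from dataclasses import dataclass
from enum import Enum


class Space(Enum):
    """Where an operand lives (hypothetical encoding, not the Dis format)."""
    FRAME = "frame"    # stack frame of the executing procedure
    MODULE = "module"  # global data of the current module
    CONST = "const"    # the operand is the constant itself


@dataclass(frozen=True)
class Operand:
    space: Space
    value: int  # slot index in FRAME/MODULE, or the constant for CONST


@dataclass(frozen=True)
class Instr:
    op: str
    src1: Operand
    src2: Operand
    dst: Operand


def to_word(x):
    """Wrap an integer to a 32-bit signed word."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x & 0x80000000 else x


# The opcode determines the operand type; both ops here work on words.
WORD_OPS = {
    "addw": lambda a, b: a + b,
    "mulw": lambda a, b: a * b,
}


def load(opnd, frame, module):
    """Read an operand from a constant, the frame, or the module data."""
    if opnd.space is Space.CONST:
        return opnd.value
    return (frame if opnd.space is Space.FRAME else module)[opnd.value]


def store(opnd, value, frame, module):
    """Write a result directly to memory; there is no operand stack."""
    if opnd.space is Space.CONST:
        raise ValueError("destination cannot be a constant")
    (frame if opnd.space is Space.FRAME else module)[opnd.value] = value


def execute(program, frame, module):
    """Run each instruction as one memory-to-memory operation."""
    for ins in program:
        a = load(ins.src1, frame, module)
        b = load(ins.src2, frame, module)
        store(ins.dst, to_word(WORD_OPS[ins.op](a, b)), frame, module)


def demo():
    frame = [0, 0]   # two word-sized slots in the procedure's frame
    module = [7]     # one word of module-global data
    program = [
        # frame[0] = module[0] + 5
        Instr("addw", Operand(Space.MODULE, 0), Operand(Space.CONST, 5),
              Operand(Space.FRAME, 0)),
        # frame[1] = frame[0] * 3
        Instr("mulw", Operand(Space.FRAME, 0), Operand(Space.CONST, 3),
              Operand(Space.FRAME, 1)),
    ]
    execute(program, frame, module)
    return frame
```

Each instruction in this sketch reads its sources and writes its destination in one step, with no push or pop traffic through an operand stack, which is the property that makes such code resemble the instructions of a real processor.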

[1] David R. Ditzel et al. Register allocation for free: The C machine stack cache. ASPLOS I, 1982.

[2] Kathleen Jensen et al. Pascal-P Implementation Notes. Pascal - The Language and its Implementation, 1981.

[3] Ken Thompson et al. Plan 9 from Bell Labs. 1995.

[4] Ken Arnold et al. The Java Programming Language. 1996.