The Behavior of Efficient Virtual Machine Interpreters on Modern Architectures

Romer et al. (ASPLOS '96) examined several interpreters and concluded that they behave much like general-purpose integer programs such as gcc. We show that there is an important class of interpreters that behaves very differently. Efficient virtual machine interpreters execute a large number of indirect branches (3.2%-13% of all executed instructions in our benchmarks, taking up to 61%-79% of the cycles on a machine with no branch prediction). We evaluate how various branch prediction schemes and methods for reducing the mispredict penalty affect the performance of several virtual machine interpreters. Our results show that, with current branch predictors, threaded-code interpreters cause fewer mispredictions and are almost twice as fast as switch-based interpreters on modern superscalar architectures.
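The contrast between the two dispatch styles can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy stack VM, its opcodes, and the names `run_switch` and `run_threaded` are hypothetical and are not taken from the interpreters measured in the paper. The threaded variant relies on the GNU C labels-as-values extension (supported by GCC and Clang), and it looks handlers up in a table by opcode; a direct-threaded interpreter would instead store the handler addresses in the code itself. The point shown is only that the switch version funnels every VM instruction through one shared indirect branch, while the threaded version ends each handler with its own indirect branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy VM: opcodes and names are illustrative, not from the paper. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define STACK_MAX 64

/* Switch dispatch: all VM instructions share a single indirect branch. */
long run_switch(const long *code) {
    long stack[STACK_MAX];
    size_t sp = 0;
    for (;;) {
        switch (*code++) {
        case OP_PUSH:
            assert(sp < STACK_MAX);
            stack[sp++] = *code++;      /* operand follows the opcode */
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}

/* Threaded-style dispatch (GNU C extension): each handler ends with its
   own copy of the indirect branch, so the predictor sees separate branches. */
long run_threaded(const long *code) {
    static void *const handlers[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    long stack[STACK_MAX];
    size_t sp = 0;
#define NEXT() goto *handlers[*code++]
    NEXT();
do_push:
    assert(sp < STACK_MAX);
    stack[sp++] = *code++;
    NEXT();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    NEXT();
do_mul:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] *= stack[sp];
    NEXT();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef NEXT
}
```

Given the program `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }`, both functions compute (2 + 3) * 4 and return 20; they differ only in how control moves from one VM instruction to the next.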

[1]  M. Anton Ertl, et al. Stack caching for interpreters, 1995, PLDI '95.

[2]  Robert B. K. Dewar, et al. Indirect threaded code, 1975, Commun. ACM.

[3]  K. Driesen, et al. Accurate indirect branch prediction, 1998, Proceedings. 25th Annual International Symposium on Computer Architecture.

[4]  James R. Bell, et al. Threaded code, 1973, Commun. ACM.

[5]  Todd A. Proebsting. Optimizing an ANSI C interpreter with superoperators, 1995, POPL '95.

[6]  Todd M. Austin, et al. The SimpleScalar tool set, version 2.0, 1997, CARN.

[7]  Xavier Leroy, et al. The ZINC experiment: an economical implementation of the ML language, 1990.

[8]  Alec Wolman, et al. The structure and performance of interpreters, 1996, ASPLOS VII.

[9]  M. Anton Ertl. A Portable Forth Engine, 1993.