Pipelined Java Virtual Machine Interpreters

The performance of a Java Virtual Machine (JVM) interpreter running on a very long instruction word (VLIW) processor can be improved by means of pipelining. While one bytecode is in its execute stage, the next bytecode is in its decode stage, and the one after that is in its fetch stage. The paper describes how we implemented threading and pipelining by rewriting the source code of the interpreter and making several modifications to the compiler. Experiments evaluating the effectiveness of pipelining are described. Pipelining improves the execution speed of a threaded interpreter by 19.4% in terms of instruction count and 14.4% in terms of cycle count. Most of the simple bytecodes, such as additions and multiplications, execute in four cycles. This number corresponds to the branch latency of our target VLIW processor. Thus most of the interpreter's code is executed in branch delay slots.
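The three overlapping stages described above can be illustrated with a minimal C sketch. It is not the interpreter from the paper: the opcode set, the two-byte instruction format, and the names `fetch`, `decode`, and `run_pipelined` are all assumptions made for illustration, and the toy program has no branch bytecodes, which in a real pipelined interpreter would require refilling the pipeline. Each loop iteration fetches bytecode i+2, decodes bytecode i+1, and executes bytecode i; because the three steps are independent of one another, a VLIW scheduler is free to overlap them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are NOT real JVM bytecode numbers. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Assumed format: every instruction is 2 bytes, opcode then operand. */
typedef struct { uint8_t op; int32_t arg; } decoded_t;

static const uint8_t HALT_INSN[2] = { OP_HALT, 0 };

/* Fetch stage: locate the raw instruction; past the end, supply HALT. */
static const uint8_t *fetch(const uint8_t *code, size_t len, size_t pc) {
    return (pc + 1 < len) ? code + pc : HALT_INSN;
}

/* Decode stage: split the raw bytes into opcode and signed operand. */
static decoded_t decode(const uint8_t *insn) {
    decoded_t d = { insn[0], (int8_t)insn[1] };
    return d;
}

int32_t run_pipelined(const uint8_t *code, size_t len) {
    int32_t stack[64];
    size_t sp = 0;

    /* Prologue: fill the pipeline with instructions 0 and 1. */
    decoded_t cur = decode(fetch(code, len, 0)); /* ready to execute */
    const uint8_t *raw = fetch(code, len, 2);    /* ready to decode  */
    size_t fetch_pc = 4;

    for (;;) {
        /* Fetch stage for instruction i+2. */
        const uint8_t *raw_next = fetch(code, len, fetch_pc);
        fetch_pc += 2;

        /* Decode stage for instruction i+1. */
        decoded_t next = decode(raw);

        /* Execute stage for instruction i. */
        switch (cur.op) {
        case OP_PUSH:
            assert(sp < 64);
            stack[sp++] = cur.arg;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        default: /* OP_HALT or unknown: return top of stack, or 0 if empty */
            return sp ? stack[sp - 1] : 0;
        }

        /* Advance the pipeline by one instruction. */
        cur = next;
        raw = raw_next;
    }
}
```

Calling `run_pipelined` on the byte sequence for push 2, push 3, add, push 4, mul, halt evaluates (2 + 3) * 4. The sketch uses a `switch` for portability; a threaded interpreter would instead jump directly from one handler to the next.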
