Paper information - SPARC-based VLIW testbed

The performance of very long instruction word (VLIW) microprocessors depends on close co-operation between the compiler and the architecture. Designing a high-performance VLIW therefore requires a testbed that allows detailed co-evaluation of compilation techniques and architectural features. The paper introduces a new VLIW testbed based on the SPARC instruction set architecture, which includes an aggressive scheduling compiler and a fast VLIW simulator. The compiler takes gcc-generated optimised SPARC code as input and generates parallelised VLIW code targeting advanced VLIW architectures. It can generate high-performance VLIW code, especially for non-numerical integer programs. The VLIW code is then translated into a dedicated C program for fast and simple compiled simulation, which produces detailed performance data. The authors have performed a comprehensive empirical study on the testbed for both large-resource and small-resource machines. The results show that a geometric-mean speedup of as much as fourfold is obtainable on nontrivial integer benchmarks without using branch probabilities when performing speculative code motion. The characteristics of the useful and useless ALU operations in each cycle are also analysed to see how the speedup is obtained. The analysis indicates that around half of the useful ALUs execute speculative instructions whose original paths are taken (thus being "hits"), yet a substantial number of ALUs are also wasted owing to useless speculative execution or copy execution.
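The headline figure in the abstract is a geometric mean of per-benchmark speedups. The following is a minimal sketch of how such a figure is computed, not the authors' measurement code. The benchmark names and cycle counts are hypothetical, and the choice of baseline (cycles of the original sequential code) is an assumption, since the abstract does not state it.

```python
import math


def speedup(baseline_cycles, vliw_cycles):
    """Speedup of the VLIW run over an assumed sequential baseline."""
    return baseline_cycles / vliw_cycles


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one positive value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (baseline_cycles, vliw_cycles) pairs; not data from the paper.
cycles = {
    "bench_a": (800, 400),    # speedup 2
    "bench_b": (900, 225),    # speedup 4
    "bench_c": (1600, 200),   # speedup 8
}

speedups = [speedup(base, vliw) for base, vliw in cycles.values()]
gm = geometric_mean(speedups)               # approximately 4
am = sum(speedups) / len(speedups)          # arithmetic mean, larger than gm
print(f"geometric mean speedup: {gm:.2f}, arithmetic mean: {am:.2f}")
```

The geometric mean is the usual choice for averaging ratios such as speedups because a single very large speedup inflates it less than it inflates the arithmetic mean.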

Soo-Mook Moon | J.-W. Ahn | H. M. Chung | J. S. Park | SangMi Shim

[1] Soo-Mook Moon, et al. Parallelizing nonnumerical code with selective scheduling and software pipelining, 1997, TOPL.

[2] Georg Sander, et al. Graph Layout through the VCG Tool, 1994, GD.

[3] Toshio Nakatani, et al. Making Compaction-Based Parallelization Affordable, 1993, IEEE Trans. Parallel Distributed Syst.

[4] Soo-Mook Moon, et al. Generalized Multiway Branch Unit for VLIW Microprocessors, 1995, IEEE Trans. Parallel Distributed Syst.

[5] Scott Mahlke, et al. Exploiting Instruction Level Parallelism in the Presence of Conditional Branches, 1997.