Instruction-level characterization of the Perfect Club programs on a vector computer

In this paper we study the instruction-level characteristics of the Perfect Club programs when compiled for and executed on a vector processor. Using a trace-driven approach, we measure the degree of vectorization of the programs, the vector lengths used in operations, the distribution of operation types, the basic block size, and the balance between memory and compute operations. We also study the spill code that the compiler introduces into the programs and the pressure on the dispatch unit of the vector architecture.
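
As a rough illustration of the kind of trace-driven measurement described above, the following Python sketch computes three of the listed metrics from a toy instruction trace. It is not the paper's tooling: the `TraceRecord` fields, the `summarize` function, and the metric definitions are all assumptions made for this example. In particular, it assumes the degree of vectorization is the fraction of all operations that are executed by vector instructions, with each vector instruction counted as `vector_length` operations.

```python
from dataclasses import dataclass


@dataclass
class TraceRecord:
    """One dynamic instruction in a hypothetical trace format."""
    kind: str                # "scalar" or "vector" (assumed encoding)
    vector_length: int = 1   # elements processed; only meaningful for vector records
    is_memory: bool = False  # True for loads and stores


def summarize(trace):
    """Return assumed-definition metrics for a list of TraceRecord objects."""
    scalar_ops = sum(1 for r in trace if r.kind == "scalar")
    vector_recs = [r for r in trace if r.kind == "vector"]
    # Each vector instruction contributes one operation per element.
    vector_ops = sum(r.vector_length for r in vector_recs)
    total_ops = scalar_ops + vector_ops

    # Memory operations, again counted per element for vector instructions.
    memory_ops = sum(
        r.vector_length if r.kind == "vector" else 1
        for r in trace
        if r.is_memory
    )

    return {
        "vectorization": vector_ops / total_ops if total_ops else 0.0,
        "avg_vector_length": vector_ops / len(vector_recs) if vector_recs else 0.0,
        "memory_fraction": memory_ops / total_ops if total_ops else 0.0,
    }


if __name__ == "__main__":
    # A made-up four-instruction trace, purely for demonstration.
    demo = [
        TraceRecord("scalar"),
        TraceRecord("scalar", is_memory=True),
        TraceRecord("vector", vector_length=64),
        TraceRecord("vector", vector_length=64, is_memory=True),
    ]
    for name, value in summarize(demo).items():
        print(f"{name}: {value:.3f}")
```

Counting operations per element rather than per instruction is one possible design choice; counting instructions instead would give a much lower vectorization figure for the same trace.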
