Using conditional execution to exploit instruction level concurrency

Multiple‐instruction‐issue processors seek to improve performance over scalar RISC processors by providing multiple pipelined functional units in order to fetch, decode and execute several instructions per cycle. The process of identifying instructions which can be executed in parallel and distributing them between the available functional units is referred to as instruction scheduling. This paper describes a simple compile‐time scheduling technique, called conditional compaction, which uses the concept of conditional execution to move instructions across basic block boundaries. It then presents the results of an investigation into the performance of the scheduling technique using C benchmark programs scheduled for machines with different functional unit configurations. This paper represents the culmination of our investigation into how much performance improvement can be obtained using conditional execution as the sole scheduling technique.
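The effect of moving instructions across a basic block boundary under a guard can be illustrated with a small model. The sketch below is not the paper's conditional compaction algorithm, which the abstract does not specify. It is a toy greedy scheduler built on stated assumptions: unit latency, identical functional units, only true (read-after-write) dependencies, distinct destination registers, and branch cost ignored. The `Instr` record, the `guard` field, and all register names are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    dest: str                    # register written by the instruction
    srcs: Tuple[str, ...]        # registers read by the instruction
    guard: Optional[str] = None  # hypothetical Boolean guard; None = always execute


def schedule(block, width):
    """Greedy in-order scheduling into issue groups of at most `width` instructions.

    Assumes unit latency and tracks only read-after-write dependencies,
    treating the guard register as one more source operand.
    """
    cycles = []    # cycles[i] holds the instructions issued in cycle i
    ready_at = {}  # register -> first cycle in which its new value can be read
    for ins in block:
        reads = ins.srcs + ((ins.guard,) if ins.guard else ())
        cycle = max((ready_at.get(r, 0) for r in reads), default=0)
        # Skip over issue groups that are already full.
        while cycle < len(cycles) and len(cycles[cycle]) >= width:
            cycle += 1
        while len(cycles) <= cycle:
            cycles.append([])
        cycles[cycle].append(ins)
        ready_at[ins.dest] = cycle + 1
    return cycles


# Block A ends with a conditional branch on b1; block B runs only if b1 is true.
block_a = [
    Instr("b1", ("r1", "r2")),   # b1 = compare(r1, r2)
    Instr("r3", ("r4", "r5")),
    Instr("r10", ("r3", "r1")),
]
block_b = [
    Instr("r6", ("r7", "r8")),
    Instr("r9", ("r6", "r7")),
]

# Scheduling each basic block on its own leaves an empty issue slot in block A.
separate = len(schedule(block_a, 2)) + len(schedule(block_b, 2))

# Guarding block B's instructions with b1 lets them be scheduled together with
# block A, so they can fill that empty slot.
guarded_b = [Instr(i.dest, i.srcs, guard="b1") for i in block_b]
merged = len(schedule(block_a + guarded_b, 2))

print(separate, merged)
```

In this model the guarded instructions still wait for the compare that produces `b1`, so any gain comes only from filling issue slots that the predecessor block would otherwise leave empty. How much is gained therefore depends on the issue width and on the dependency structure of the code.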
