Parallelization of Control Recurrences for ILP Processors
[1] Norman P. Jouppi,et al. Available instruction-level parallelism for superscalar and superpipelined machines , 1989, ASPLOS III.
[2] Peter Y.-T. Hsu,et al. Overlapped loop support in the Cydra 5 , 1989, ASPLOS III.
[3] F. H. Mcmahon,et al. The Livermore Fortran Kernels: A Computer Test of the Numerical Performance Range , 1986 .
[4] James C. Dehnert,et al. Overlapped loop support in the Cydra 5 , 1989, ASPLOS 1989.
[5] B. Ramakrishna Rau. Cydra 5 directed dataflow architecture , 1988, Digest of Papers. COMPCON Spring 88 Thirty-Third IEEE Computer Society International Conference.
[6] Mike Schlansker,et al. Parallelization of loops with exits on pipelined architectures , 1990, Proceedings SUPERCOMPUTING '90.
[7] Alexandru Nicolau. Parallelism, memory anti-aliasing and correctness for trace scheduling compilers (disambiguation, flow-analysis, compaction) , 1984 .
[8] John R. Ellis,et al. Bulldog: A Compiler for VLIW Architectures , 1986 .
[9] Joseph A. Fisher,et al. Trace Scheduling: A Technique for Global Microcode Compaction , 1981, IEEE Transactions on Computers.
[10] B. Ramakrishna Rau,et al. The Cydra 5 departmental supercomputer: design philosophies, decisions, and trade-offs , 1989, Computer.
[11] B. R. Rau,et al. Code Generation Schemas for Modulo Scheduled DO-Loops and WHILE-Loops , 1992 .
[12] Scott A. Mahlke,et al. Reverse If-Conversion , 1993, PLDI '93.
[13] Edward S. Davidson,et al. Highly concurrent scalar processing , 1986, ISCA 1986.
[14] Vinod Kathail,et al. Acceleration of First and Higher Order Recurrences on Processors with Instruction Level Parallelism , 1993, LCPC.
[15] James C. Dehnert,et al. Compiling for the Cydra 5 , 1993, The Journal of Supercomputing.
[16] B. Ramakrishna Rau,et al. Data Flow and Dependence Analysis for Instruction Level Parallelism , 1991, LCPC.
[17] M. Schlansker,et al. On Predicated Execution , 1991 .
[18] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS 1987.
[19] Alexandru Nicolau,et al. Measuring the Parallelism Available for Very Long Instruction Word Architectures , 1984, IEEE Transactions on Computers.
[20] David W. Wall,et al. Limits of instruction-level parallelism , 1991, ASPLOS IV.
[21] Woody Lichtenstein,et al. The multiflow trace scheduling compiler , 1993, The Journal of Supercomputing.
[22] Kemal Ebcioglu,et al. An efficient resource-constrained global scheduling technique for superscalar and VLIW processors , 1992, MICRO 1992.
[23] B. Ramakrishna Rau,et al. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing , 1981, MICRO 14.
[24] Robert P. Colwell,et al. A VLIW architecture for a trace scheduling compiler , 1987, ASPLOS.
[25] David L. Kuck,et al. The Structure of Computers and Computations , 1978 .
[26] Scott A. Mahlke,et al. The superblock: An effective technique for VLIW and superscalar compilation , 1993, The Journal of Supercomputing.
[27] Joe D. Warren,et al. The program dependence graph and its use in optimization , 1984, TOPL.
[28] Norman P. Jouppi,et al. Available instruction-level parallelism for superscalar and superpipelined machines , 1989, ASPLOS 1989.
[29] Vinod Kathail,et al. Height reduction of control recurrences for ILP processors , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[30] Monica S. Lam,et al. Limits of control flow on parallelism , 1992, ISCA '92.
[31] Scott Mahlke,et al. Sentinel scheduling: a model for compiler-controlled speculative execution , 1993 .
[32] Scott A. Mahlke,et al. Effective compiler support for predicated execution using the hyperblock , 1992, MICRO 25.