Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation
[1] Scott A. Mahlke, et al. Reverse If-Conversion, 1993, PLDI '93.
[2] B. Ramakrishna Rau, et al. The Cydra 5 departmental supercomputer: design philosophies, decisions, and trade-offs, 1989, Computer.
[3] B. Ramakrishna Rau, et al. Iterative modulo scheduling: an algorithm for software pipelining loops, 1994, MICRO 27.
[4] David F. Bacon, et al. Compiler transformations for high-performance computing, 1994, CSUR.
[5] Richard A. Huff, et al. Lifetime-sensitive modulo scheduling, 1993, PLDI '93.
[6] Scott A. Mahlke, et al. Compiler code transformations for superscalar-based high-performance systems, 1992, Proceedings Supercomputing '92.
[7] David Bernstein, et al. Dynamic memory disambiguation for array references, 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[8] Jack W. Davidson, et al. Memory access coalescing: a technique for eliminating redundant memory accesses, 1994, PLDI '94.
[9] B. Ramakrishna Rau, et al. Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing, 1981, MICRO 14.
[10] Manuel E. Benitez. Register allocation and phase interactions in retargetable optimizing compilers, 1994.
[11] John Paul Shen, et al. Speculative disambiguation: a compilation technique for dynamic memory disambiguation, 1994, ISCA '94.
[12] David W. Wall, et al. Limits of instruction-level parallelism, 1991, ASPLOS IV.
[13] F. H. McMahon, et al. The Livermore Fortran Kernels: A Computer Test of the Numerical Performance Range, 1986.
[14] David B. Whalley, et al. Ease: an environment for architecture study and experimentation, 1990, SIGMETRICS '90.
[15] Scott A. Mahlke, et al. Superblock formation using static program analysis, 1993, Proceedings of the 26th Annual International Symposium on Microarchitecture.
[16] B. Ramakrishna Rau, et al. Register allocation for software pipelined loops, 1992, PLDI '92.
[17] David A. Padua, et al. Dependence graphs and compiler optimizations, 1981, POPL '81.
[18] Alfred V. Aho, et al. Compilers: Principles, Techniques, and Tools, 1986, Addison-Wesley series in computer science / World student series edition.
[19] Bruce R. Childers, et al. Memory bandwidth optimizations for wide-bus machines, 1993, [1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.
[20] Jian Wang, et al. GURPR*: a new global software pipelining algorithm, 1991, MICRO 24.
[21] Vicki H. Allan, et al. Software pipelining: an evaluation of enhanced pipelining, 1991, MICRO 24.
[22] Wen-mei W. Hwu, et al. The benefit of predicated execution for software pipelining, 1993, [1993] Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.
[23] Christos A. Papachristou, et al. A VLIW architecture based on shifting register files, 1993, MICRO 1993.
[24] Mike Schlansker, et al. Parallelization of loops with exits on pipelined architectures, 1990, Proceedings SUPERCOMPUTING '90.
[25] Shlomo Weiss, et al. A study of scalar compilation techniques for pipelined supercomputers, 1987, ASPLOS 1987.
[26] Gerry Kane, et al. MIPS RISC Architecture, 1987.
[27] Manuel E. Benitez, et al. A portable global optimizer and linker, 1988, PLDI '88.