Hyperblock performance optimizations for ILP processors
[1] David M. Gallagher,et al. Dynamic Memory Disambiguation Using the Memory Conflict Buffer , 1994 .
[2] Sadun Anik,et al. Architectural and Software Support for Executing Numerical Applications on High Performance Computers , 1993 .
[3] Po-Hua Chang,et al. Compiler support for multiple-instruction-issue architectures , 1991 .
[4] Mike Schlansker,et al. Parallelization of loops with exits on pipelined architectures , 1990, Proceedings SUPERCOMPUTING '90.
[5] Ken Kennedy,et al. Conversion of control dependence to data dependence , 1983, POPL '83.
[6] Scott Mahlke,et al. Effective compiler support for predicated execution using the hyperblock , 1992, MICRO 1992.
[7] John Paul Shen,et al. An instruction-level performance analysis of the Multiflow TRACE 14/300 , 1991, MICRO 24.
[8] Vinod Kathail,et al. Height reduction of control recurrences for ILP processors , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[9] Norman P. Jouppi,et al. Available instruction-level parallelism for superscalar and superpipelined machines , 1989, ASPLOS III.
[10] Scott A. Mahlke,et al. Profile‐guided automatic inline expansion for C programs , 1992, Softw. Pract. Exp.
[11] Scott A. Mahlke,et al. A comparison of full and partial predicated execution support for ILP processors , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[12] David C. Lin. Compiler Support For Predicated Execution In Superscalar Processors , 1992 .
[13] R. A. Towle,et al. Control and data dependence for program transformations , 1976 .
[14] Roger A. Bringmann. A Template for Code Generator Development Using the IMPACT-I C Compiler , 1992 .
[15] M. Schlansker,et al. On Predicated Execution , 1991 .
[16] Edward S. Davidson,et al. Highly concurrent scalar processing , 1986, ISCA 1986.
[17] Michael D. Smith,et al. Limits on multiple instruction issue , 1989, ASPLOS III.
[18] Wen-mei W. Hwu,et al. Achieving High Instruction Cache Performance With An Optimizing Compiler , 1989, The 16th Annual International Symposium on Computer Architecture.
[19] Yoji Yamada,et al. Data relocation and prefetching for programs with large data sets , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[20] Richard E. Hank,et al. Machine Independent Register Allocation For The Impact-I C Compiler , 1993 .
[21] David Mark Gallagher,et al. Memory disambiguation to facilitate instruction-level parallelism compilation , 1995 .
[22] B. R. Rau,et al. The Cydra 5 Departmental Supercomputer: design philosophies, decisions and trade-offs , 1989, Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track.
[23] Richard E. Hank,et al. Region-based compilation: an introduction and motivation , 1995, MICRO 1995.
[24] Scott Mahlke,et al. Design And Implementation Of A Portable Global Code Optimizer , 1991 .
[25] Scott A. Mahlke,et al. The Importance of Prepass Code Scheduling for Superscalar and Superpipelined Processors , 1995, IEEE Trans. Computers.
[26] William Yu-Wei Chen Jr.,et al. Data preload for superscalar and VLIW processors , 1993 .