A system level perspective on branch architecture performance
Dirk Grunwald | Brad Calder | Joel Emer
[1] Yale N. Patt,et al. On tuning the microarchitecture of an HPS implementation of the VAX , 1987, MICRO 20.
[2] Yale N. Patt,et al. HPS, a new microarchitecture: rationale and introduction , 1985, MICRO 18.
[3] Scott Mahlke,et al. Effective compiler support for predicated execution using the hyperblock , 1992, MICRO 1992.
[4] Yale N. Patt,et al. Critical issues regarding HPS, a high performance microarchitecture , 1985, MICRO 18.
[5] Arthur B. Maccabe,et al. The program dependence web: a representation supporting control-, data-, and demand-driven interpretation of imperative languages , 1990, PLDI '90.
[6] Hunter Scales,et al. Single instruction stream parallelism is greater than two , 1991 .
[7] Dirk Grunwald,et al. Next cache line and set prediction , 1995, Proceedings 22nd Annual International Symposium on Computer Architecture.
[8] Michael Shebanow,et al. Single instruction stream parallelism is greater than two , 1991, ISCA '91.
[9] B. Ramakrishna Rau,et al. Efficient code generation for horizontal architectures: Compiler techniques and architectural support , 1982, ISCA '82.
[10] Ravi Nair,et al. Optimal 2-Bit Branch Predictors , 1995, IEEE Trans. Computers.
[11] Shlomit S. Pinter,et al. Register allocation with instruction scheduling: a new approach , 1996, Journal of Programming Languages.
[12] S. McFarling. Combining Branch Predictors , 1993 .
[13] Edward S. Davidson,et al. Register requirements of pipelined processors , 1992, ICS '92.
[14] B. Ramakrishna Rau,et al. Iterative modulo scheduling: an algorithm for software pipelining loops , 1994, MICRO 27.
[15] Yale N. Patt,et al. A comprehensive instruction fetch mechanism for a processor supporting speculative execution , 1992, MICRO 1992.
[16] Ken Kennedy,et al. Conversion of control dependence to data dependence , 1983, POPL '83.
[17] Alan Jay Smith,et al. Branch Prediction Strategies and Branch Target Buffer Design , 1995, Computer.
[18] Shlomit S. Pinter,et al. Register allocation with instruction scheduling , 1993, PLDI '93.
[19] Alexandre E. Eichenberger,et al. Stage scheduling: a technique to reduce the register requirements of a modulo schedule , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[20] D. Grunwald,et al. Fast & Accurate Instruction Fetch and Branch Prediction , 1994 .
[21] Alan Eustace,et al. ATOM - A System for Building Customized Program Analysis Tools , 1994, PLDI.
[22] Yale N. Patt,et al. Run-time generation of HPS microinstructions from a VAX instruction stream , 1986, MICRO 19.
[23] Alexandre E. Eichenberger,et al. Stage scheduling: a technique to reduce the register requirements of a modulo schedule , 1995, MICRO 1995.
[24] Manoj Franklin,et al. A fill-unit approach to multiple instruction issue , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.
[25] Edward S. Davidson,et al. Highly concurrent scalar processing , 1986, ISCA 1986.
[26] S. McFarling,et al. Reducing the cost of branches , 1986, ISCA '86.
[27] Chris H. Perleberg,et al. Branch Target Buffer Design and Optimization , 1993, IEEE Trans. Computers.
[28] Dirk Grunwald,et al. Fast and accurate instruction fetch and branch prediction , 1994, ISCA '94.
[29] Guang R. Gao,et al. A Register Allocation Framework Based on Hierarchical Cyclic Interval Graphs , 1992, CC.
[30] Richard E. Kessler,et al. Page placement algorithms for large real-indexed caches , 1992, TOCS.
[31] S. Peter Song,et al. The PowerPC 604 RISC microprocessor , 1994, IEEE Micro.
[32] Yale N. Patt,et al. Alternative Implementations of Two-Level Adaptive Branch Prediction , 1992, [1992] Proceedings the 19th Annual International Symposium on Computer Architecture.
[33] Alexandre E. Eichenberger,et al. Minimum register requirements for a modulo schedule , 1994, MICRO 27.
[34] B. Ramakrishna Rau,et al. Register allocation for software pipelined loops , 1992, PLDI '92.
[35] Gregory J. Chaitin,et al. Register allocation and spilling via graph coloring , 2004, SIGP.
[36] Brian N. Bershad,et al. Avoiding conflict misses dynamically in large direct-mapped caches , 1994, ASPLOS VI.
[37] Grant E. Haab,et al. Enhanced Modulo Scheduling For Loops With Conditional Branches , 1992, [1992] Proceedings the 25th Annual International Symposium on Microarchitecture MICRO 25.
[38] Alexandre E. Eichenberger,et al. Optimum modulo schedules for minimum register requirements , 1995 .
[39] S. L. Zelen. Rationale and Introduction , 1987 .
[40] Alexander Aiken,et al. A Development Environment for Horizontal Microcode , 1986, IEEE Trans. Software Eng.
[41] Joseph T. Rahmeh,et al. Improving the accuracy of dynamic branch prediction using branch correlation , 1992, ASPLOS V.
[42] Dirk Grunwald,et al. Reducing branch costs via branch alignment , 1994, ASPLOS VI.
[43] Yale N. Patt,et al. Hardware Support For Large Atomic Units in Dynamically Scheduled Machines , 1988, [1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21.
[44] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[45] M. Schlansker,et al. On Predicated Execution , 1991 .
[46] Richard A. Huff,et al. Lifetime-sensitive modulo scheduling , 1993, PLDI '93.
[47] Yale N. Patt,et al. A comparison of dynamic branch predictors that use two levels of branch history , 1993, ISCA '93.
[48] B. Ramakrishna Rau,et al. Efficient code generation for horizontal architectures: Compiler techniques and architectural support , 1982, ISCA '82.
[49] David A. Padua,et al. Gated SSA-based demand-driven symbolic analysis for parallelizing compilers , 1995, ICS '95.
[50] Yale N. Patt,et al. A comprehensive instruction fetch mechanism for a processor supporting speculative execution , 1992, MICRO 25.
[51] Prithviraj Banerjee,et al. Advanced compilation techniques in the PARADIGM compiler for distributed-memory multicomputers , 1995, ICS '95.
[52] Vinod Kathail,et al. Height reduction of control recurrences for ILP processors , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.