Access region locality for high-bandwidth processor memory system design
[1] Todd M. Austin,et al. The SimpleScalar tool set, version 2.0 , 1997, CARN.
[2] Gary S. Tyson,et al. Improving the accuracy and performance of memory communication through renaming , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[3] Doug Hunt,et al. Advanced performance features of the 64-bit PA-8000 , 1995, Digest of Papers. COMPCON'95. Technologies for the Information Superhighway.
[4] K. Kavi. Cache Memories Cache Memories in Uniprocessors. Reading versus Writing. Improving Performance , 2022 .
[5] Kenneth M. Wilson,et al. Increasing Cache Port Efficiency for Dynamic Superscalar Microprocessors , 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[6] Sangyeun Cho,et al. Decoupling local variable accesses in a wide-issue superscalar processor , 1999, ISCA.
[7] Andreas Moshovos,et al. Dynamic Speculation and Synchronization of Data Dependences , 1997, ISCA.
[8] Todd M. Austin,et al. Zero-cycle loads: microarchitecture support for reducing load latency , 1995, Proceedings of the 28th Annual International Symposium on Microarchitecture.
[9] Joel S. Emer,et al. Memory dependence prediction using store sets , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[10] Manoj Franklin,et al. High-bandwidth data memory systems for superscalar processors , 1991 .
[11] Mike Johnson,et al. Superscalar microprocessor design , 1991, Prentice Hall series in innovative technology.
[12] Gurindar S. Sohi. Instruction Issue Logic for High-Performance, Interruptible, Multiple Functional Unit, Pipelined Computers , 1990 .
[13] Robert G. Wedig,et al. A performance analysis of automatically managed top of stack buffers , 1987, ISCA '87.
[14] James E. Smith,et al. Complexity-Effective Superscalar Processors , 1997, Conference Proceedings. The 24th Annual International Symposium on Computer Architecture.
[15] Vicki H. Allan,et al. Petri net versus modulo scheduling for software pipelining , 1995, MICRO 1995.
[16] Carlo H. Séquin,et al. A VLSI RISC , 1982, Computer.
[17] Mikko H. Lipasti,et al. Superspeculative Microarchitecture for Beyond AD 2000 , 1997, Computer.
[18] Michael J. Flynn,et al. Execution Architecture: The DELtran Experiment , 1983, IEEE Transactions on Computers.
[19] Stamatis Vassiliadis,et al. A load-instruction unit for pipelined processors , 1993, IBM J. Res. Dev..
[20] Michael J. Flynn,et al. Computer Architecture: Pipelined and Parallel Processor Design , 1995 .
[21] Douglas W. Clark,et al. A Characterization of Processor Performance in the VAX-11/780 , 1984, ISCA '84.
[22] Andreas Moshovos,et al. Streamlining inter-operation memory communication via data dependence prediction , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[23] Christian Piguet,et al. Microprocessor design , 1997 .
[24] Alfred V. Aho,et al. Compilers: Principles, Techniques, and Tools , 1986, Addison-Wesley series in computer science / World student series edition.
[25] Mikko H. Lipasti,et al. Value locality and load value prediction , 1996, ASPLOS VII.
[26] James E. Smith,et al. The predictability of data values , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[27] Thorsten von Eicken,et al. Technical Commentary: IEEE Computer , 1999 .
[28] Yale N. Patt,et al. Increasing the instruction fetch rate via multiple branch prediction and a branch address cache , 1993, ICS '93.
[29] Mikko H. Lipasti,et al. Value locality and load value prediction , 1996 .
[30] Kenneth C. Yeager. The MIPS R10000 superscalar microprocessor , 1996, IEEE Micro.
[31] David R. Ditzel,et al. Register allocation for free: The C machine stack cache , 1982, ASPLOS I.
[32] S. McFarling. Combining Branch Predictors , 1993 .
[33] Quinn Jacobson,et al. Trace processors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[34] Gary S. Tyson,et al. On high-bandwidth data cache design for multi-issue processors , 1997, Proceedings of 30th Annual International Symposium on Microarchitecture.
[35] Yale N. Patt,et al. One Billion Transistors, One Uniprocessor, One Chip , 1997, Computer.
[36] Eric Rotenberg,et al. Trace cache: a low latency approach to high bandwidth instruction fetching , 1996, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29.
[37] Gurindar S. Sohi,et al. High-bandwidth data memory systems for superscalar processors , 1991, ASPLOS IV.