Paper information - Scaling to the end of silicon with EDGE architectures

Microprocessor designs are on the verge of a post-RISC era in which companies must introduce new ISAs to address the challenges that modern CMOS technologies pose while also exploiting the massive levels of integration now possible. To meet these challenges, we have developed a new class of ISAs, called explicit data graph execution (EDGE), that will match the characteristics of semiconductor technology over the next decade. The TRIPS architecture is the first instantiation of an EDGE instruction set and is intended to scale to new levels of power efficiency and high performance.
