A 5-GHz Mesh Interconnect for a Teraflops Processor

A multicore processor in 65-nm technology with 80 single-precision floating-point cores delivers performance in excess of one teraflops while consuming less than 100 W. A 2D on-die mesh interconnection network operating at 5 GHz provides the high-performance communication fabric that connects the cores. The network delivers a bisection bandwidth of 2.56 terabits per second and a per-hop fall-through latency of 1 nanosecond.
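
The headline figures above imply a few per-cycle and per-core quantities that may be easier to reason about. The following is a minimal sketch of that arithmetic, using only the numbers stated in the abstract; the function and constant names are illustrative, and the per-core and per-watt values are bounds (the abstract says "in excess of" and "less than"), not measured results.

```python
# Figures taken from the abstract; all names here are illustrative.
CLOCK_HZ = 5_000_000_000           # 5-GHz mesh clock
BISECTION_BPS = 2_560_000_000_000  # 2.56 terabits per second
HOP_LATENCY_PS = 1_000             # 1 ns per-hop fall-through latency
CORES = 80
PEAK_FLOPS_MIN = 10**12            # "in excess of" one teraflops (lower bound)
POWER_W_MAX = 100                  # "less than 100 W" (upper bound)


def derived_figures():
    """Return quantities implied by the abstract's headline numbers."""
    return {
        # Clock cycles spent per router hop at the mesh clock rate.
        "cycles_per_hop": HOP_LATENCY_PS * CLOCK_HZ // 10**12,
        # Bits crossing the bisection in each clock cycle.
        "bisection_bits_per_cycle": BISECTION_BPS // CLOCK_HZ,
        # Lower bound on per-core throughput, assuming an even split.
        "min_gflops_per_core": PEAK_FLOPS_MIN / CORES / 1e9,
        # Lower bound on energy efficiency at the stated power ceiling.
        "min_gflops_per_watt": PEAK_FLOPS_MIN / POWER_W_MAX / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Under these assumptions, a 1-ns hop at 5 GHz corresponds to five clock cycles, and the stated bisection bandwidth corresponds to 512 bits crossing the bisection per cycle. How those bits are divided among individual links depends on the mesh dimensions and link width, which the abstract does not state.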
