An on-chip global broadcast network design with equalized transmission lines in the 1024-core era

Based on current trends in multicore scaling, chips with 1024 cores may be available within the next decade. With that many cores, cache coherence becomes a critical challenge because it relies on broadcast operations. On a conventional electrical mesh interconnect, broadcasting common data to all cores is difficult to perform efficiently. In this paper, we develop a high-throughput, low-latency, and power-efficient equalized dense transmission-line (T-line) structure tailored for global broadcasting. We also propose a hierarchical architecture and an efficient physical structure for 1024-core communication. Evaluation results show that the proposed solution achieves high performance.
