Headfirst sliding routing: A time-based routing scheme for bus-NoC hybrid 3-D architecture

A contactless approach that connects chips in the vertical dimension has great potential for customizing components in 3-D chip multiprocessors (CMPs): card-style components inserted into a single cartridge communicate with each other wirelessly using inductive-coupling technology. To simplify the vertical communication interfaces, static Time Division Multiple Access (TDMA) is used for the vertical broadcast buses, while arbitrary or customized topologies can be used for the intra-chip networks. In this paper, we propose the Headfirst sliding routing scheme to overcome the latency limitations of simple static TDMA-based vertical buses. At any given moment, each vertical bus grants its communication time-slot to a different chip, and the grants rotate periodically, so the buses operate with different phases. Depending on the current time, a packet is routed toward the best vertical bus (elevator) so that it arrives just before that elevator acquires its communication time-slot. Network simulations show that Headfirst sliding routing reduces communication latency by up to 32.7%, and full-system CMP simulations show that it reduces application execution time by 9.9%. Synthesis results show that the area and critical path delay overheads are modest.
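
The time-based elevator selection described above can be illustrated with a minimal sketch. This is not the paper's implementation: the slot length, the phase assignment, the 2-D mesh with one cycle per hop, and all names (`Elevator`, `wait_for_slot`, `pick_elevator`) are assumptions made only for illustration. The sketch picks, for a packet on a given chip at a given time, the elevator on which the packet can start its vertical transfer earliest.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Elevator:
    x: int      # position of the vertical bus on the chip (assumed 2-D mesh)
    y: int
    phase: int  # slot offset of this bus relative to bus 0 (assumed)

def wait_for_slot(elev, chip, t, n_chips, slot_len):
    """Cycles a packet arriving at time t waits until `chip` owns the bus.

    Assumed static TDMA: in slot k, the bus belongs to chip (k + phase) % n_chips.
    Simplification: arriving anywhere inside the chip's own slot costs no wait.
    """
    owner = (t // slot_len + elev.phase) % n_chips
    if owner == chip:
        return 0
    slots_ahead = (chip - owner) % n_chips
    return slots_ahead * slot_len - (t % slot_len)

def pick_elevator(elevators, chip, pos, t, n_chips, slot_len):
    """Return (index, ready_time) of the elevator usable earliest.

    Intra-chip travel is modeled as Manhattan distance, one cycle per hop
    (an assumption; the source allows arbitrary intra-chip topologies).
    """
    best = None
    for i, e in enumerate(elevators):
        dist = abs(e.x - pos[0]) + abs(e.y - pos[1])
        arrival = t + dist
        ready = arrival + wait_for_slot(e, chip, arrival, n_chips, slot_len)
        key = (ready, dist, i)  # break ties by shorter path, then index
        if best is None or key < best:
            best = key
    return best[2], best[0]

# Four chips, four corner elevators whose slots are staggered by one phase each.
ELEVATORS = [Elevator(0, 0, 0), Elevator(3, 0, 1),
             Elevator(0, 3, 2), Elevator(3, 3, 3)]

if __name__ == "__main__":
    print(pick_elevator(ELEVATORS, chip=2, pos=(1, 1), t=0, n_chips=4, slot_len=4))
    print(pick_elevator(ELEVATORS, chip=2, pos=(1, 1), t=4, n_chips=4, slot_len=4))
```

In this toy configuration the nearest elevator is not always the best choice: the selected elevator changes with the injection time because the buses run with different phases, which is the effect the routing scheme exploits.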
