ArSMART: An Improved SMART NoC Design Supporting Arbitrary-Turn Transmission

SMART NoC is a recently proposed high-performance NoC design that transmits uncontended flits to distant processing elements (PEs) in a single cycle through express bypass paths. When contention occurs, however, low-priority flits are not only buffered but also unable to fully utilize the bypass. Several existing routing algorithms reduce contention by routing around busy routers and links, but they cannot be applied directly to SMART because it lacks support for arbitrary-turn routing, in which the number and direction of turns are unconstrained. In this article, to minimize contention and make fuller use of the bypass, we propose an improved SMART NoC, called ArSMART, that enables arbitrary-turn transmission. Specifically, ArSMART divides the NoC into multiple clusters, in which route computation is conducted by a cluster controller and data forwarding is performed by bufferless reconfigurable routers. Because long-range transmission in SMART NoC must bypass intermediate arbitration, we directly configure the connections between input and output ports rather than apply hop-by-hop table-based arbitration. To exploit the resulting communication capability further, we propose effective adaptive routing algorithms that are compatible with ArSMART. The route computation overhead, one of the main concerns for adaptive routing algorithms, is hidden by our carefully designed control mechanism. Compared with the state-of-the-art SMART NoC, experimental results demonstrate an average reduction of 40.7% in application schedule length and 29.7% in energy consumption.
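
The idea of configuring input and output port connections directly, instead of arbitrating hop by hop, can be illustrated with a small Python sketch. It is not the ArSMART implementation: the 2D-mesh coordinates, the port names (`E`, `W`, `N`, `S`, `LOCAL`), and the helpers `direction`, `configure_path`, and `count_turns` are all assumptions made for illustration. Given a path chosen by some controller, it derives one (input port, output port) connection per router, with no limit on how many turns the path takes.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router in an assumed 2D mesh

# Hypothetical port naming: direction of travel -> coordinate step.
STEP = {"E": (1, 0), "W": (-1, 0), "N": (0, 1), "S": (0, -1)}
OPPOSITE = {"E": "W", "W": "E", "N": "S", "S": "N"}


def direction(a: Coord, b: Coord) -> str:
    """Return the direction of travel from router a to its neighbour b."""
    delta = (b[0] - a[0], b[1] - a[1])
    for name, step in STEP.items():
        if step == delta:
            return name
    raise ValueError(f"{a} and {b} are not adjacent mesh routers")


def configure_path(path: List[Coord]) -> Dict[Coord, Tuple[str, str]]:
    """Map each router on the path to an (input_port, output_port) connection.

    This simplified sketch assumes a router appears at most once per path.
    """
    config: Dict[Coord, Tuple[str, str]] = {}
    for i, node in enumerate(path):
        if node in config:
            raise ValueError(f"router {node} appears twice on the path")
        # A flit travelling east enters the next router on its west port.
        in_port = "LOCAL" if i == 0 else OPPOSITE[direction(path[i - 1], node)]
        out_port = "LOCAL" if i == len(path) - 1 else direction(node, path[i + 1])
        config[node] = (in_port, out_port)
    return config


def count_turns(path: List[Coord]) -> int:
    """Count the direction changes along the path."""
    dirs = [direction(a, b) for a, b in zip(path, path[1:])]
    return sum(1 for d1, d2 in zip(dirs, dirs[1:]) if d1 != d2)


# A staircase path with three turns, which single-turn routing would not allow.
path = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]
config = configure_path(path)
```

Here `config[(1, 0)]` is `("W", "N")`: the flit enters on the west port and leaves on the north port, so the turn is fixed by configuration before the flit arrives rather than decided by per-hop arbitration.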
