A Low-Latency and Low-Power Hybrid Scheme for On-Chip Networks

Network-on-chip (NoC) has emerged as a vital factor that determines the performance and power consumption of many-core systems. This paper proposes a hybrid scheme for NoCs that aims at low latency and low power consumption. In the hybrid scheme, a novel switching mechanism, called virtual circuit switching, operates alongside circuit switching and packet switching. Flits traveling under virtual circuit switching traverse a router in only one stage. In addition, multiple virtual circuit-switched (VCS) connections can share a common physical channel. A path allocation algorithm is also proposed to determine the VCS connections and circuit-switched connections on a mesh-connected NoC such that both communication latency and power are optimized. A set of synthetic and real traffic workloads is used to evaluate the effectiveness of the proposed hybrid scheme. The experimental results show that the hybrid scheme reduces both communication latency and power. For instance, for real traffic workloads, it achieves an average latency reduction of 20.3% and an average power saving of 33.2% compared with the baseline NoC. Compared with the NoC with virtual point-to-point connections (VIP), the proposed hybrid scheme reduces latency by 6.8% and power by 11.3% on average.
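
To give a feel for why a one-stage router traversal matters, the following minimal sketch compares zero-load head-flit latency over a path of several hops. It is an illustration only, not the paper's model: the 4-stage packet-switched pipeline, the 1-cycle link delay, and the function name `zero_load_latency` are assumptions introduced here, and contention, serialization, and connection setup are ignored. Only the one-stage VCS traversal comes from the abstract.

```python
def zero_load_latency(hops: int, router_stages: int, link_cycles: int = 1) -> int:
    """Cycles for a head flit to cross `hops` routers and the links between them.

    Hypothetical model: no contention, no serialization, no setup cost.
    """
    if hops < 1:
        raise ValueError("a path must traverse at least one router")
    return hops * router_stages + (hops - 1) * link_cycles


PACKET_STAGES = 4  # assumed baseline router pipeline depth (not from the paper)
VCS_STAGES = 1     # one-stage traversal, as stated in the abstract

if __name__ == "__main__":
    for hops in (2, 4, 6):
        packet = zero_load_latency(hops, PACKET_STAGES)
        vcs = zero_load_latency(hops, VCS_STAGES)
        print(f"{hops} hops: packet-switched {packet} cycles, VCS {vcs} cycles")
```

Under these assumptions the gap grows linearly with hop count, which is one reason a path allocation step that places frequently communicating pairs on VCS connections could pay off. The measured gains reported above also depend on traffic and contention, which this sketch does not capture.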
