An Iterative Computational Technique for Performance Evaluation of Networks-on-Chip

The trend toward integrated many-core architectures makes the network-on-chip (NoC) the on-chip communication infrastructure of choice. However, unlike a simple bus, a NoC is distributed and complex in terms of topology, wire size, routing algorithm, and so on, so its timing behavior, and thus its performance, is difficult to predict. Therefore, one of the important phases in the NoC design flow is performance evaluation, which extracts performance metrics to verify whether a specific instance from the NoC design space satisfies the requirements of the entire system. Reducing the time needed to obtain the NoC performance, and consequently speeding up design space exploration, is therefore one of the keys to considerably reducing design-flow time and cost. To this end, we propose in this paper a novel analytical performance evaluation method that can be used in the earliest stages of the design flow, before resorting to time-consuming simulations. The analytical method is used to evaluate the performance of a general-purpose NoC, and we show that it can predict the router latency, end-to-end per-flow latency, and network saturation point with an accuracy comparable to that of a cycle-accurate simulation. To systematically analyze the accuracy of our method against the corresponding simulation model, we also present an innovative accuracy analysis method.
