Performance Analysis and Design Space Exploration of On-Chip Interconnection Networks

The advance of semiconductor technology, which has led to more than one billion transistors on a single chip, has enabled designers to integrate dozens of intellectual property (IP) blocks together ...
