Paper information - A design methodology for efficient application-specific on-chip interconnects


As the level of chip integration continues to advance at a fast pace, the demand for efficient interconnects, whether on-chip or off-chip, is rapidly increasing. Traditional interconnects such as buses, point-to-point wires, and regular topologies may suffer from poor resource sharing in the time and space domains, leading to high contention or low resource utilization. In this paper, we propose a design methodology for constructing networks for special-purpose computer systems with well-behaved (known) communication characteristics. A temporal and spatial model is proposed to define the sufficient condition for contention-free communication. Based on this model, a design methodology using a recursive bisection technique is applied to systematically partition a parallel system such that the required number of links and switches is minimized while achieving low contention. Results show that the design methodology can generate more optimized on-chip networks with up to 60 percent fewer resources than meshes or tori while providing blocking performance closer to that of a fully connected crossbar.

Timothy Mark Pinkston | Wai Hong Ho

[1] Ronald L. Rivest,et al. Introduction to Algorithms , 1990 .

[2] David H. Bailey,et al. The NAS Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..

[3] Stephen D. Brown,et al. Flexibility of interconnection structures for field-programmable gate arrays , 1991 .

[4] Luciano Lavagno,et al. Hardware-software codesign of embedded systems , 1994, IEEE Micro.

[5] Wayne Wolf,et al. Hardware-software co-design of embedded systems , 1994, Proc. IEEE.

[6] Cécile Germain,et al. Static Communications in Parallel Scientific Programs , 1994, PARLE.

[7] William Gropp,et al. User's guide for mpich, a portable implementation of MPI , 1996 .

[8] Ramesh Subramonian,et al. LogP: a practical model of parallel computation , 1996, CACM.

[9] Lionel M. Ni,et al. The effects of network contention on processor allocation strategies , 1996, Proceedings of International Conference on Parallel Processing.

[10] Sarita V. Adve,et al. RSIM Reference Manual: Version 1.0 , 1997 .

[11] A. O. Fernandes,et al. Hardware-software codesign of embedded systems , 1998, Proceedings. XI Brazilian Symposium on Integrated Circuit Design (Cat. No.98EX216).

[12] Niraj K. Jha,et al. MOGAC: a multiobjective genetic algorithm for hardware-software cosynthesis of distributed embedded systems , 1998, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[13] Timothy Mark Pinkston,et al. Design issues for core-based optoelectronic chips: a case study of the WARRP network router , 1999 .

[14] Shietung Peng,et al. Wavelengths requirement for permutation routing in all-optical multistage interconnection networks , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.

[15] José Duato,et al. Characterization of communications between processes in message-passing applications , 2000, Proceedings IEEE International Conference on Cluster Computing. CLUSTER 2000.

[16] F. Silla,et al. A new task mapping technique for communication-aware scheduling strategies , 2001, Proceedings International Conference on Parallel Processing Workshops.

[17] W. Dally,et al. Route packets, not wires: on-chip interconnection networks , 2001, Proceedings of the 38th Design Automation Conference (IEEE Cat. No.01CH37232).

[18] Jason Miller,et al. The Raw Processor: A Composeable 32-Bit Fabric for Embedded and General Purpose Computing , 2001 .

[19] Ruby B. Lee,et al. Efficient permutation instructions for fast software cryptography , 2001 .

[20] A design space evaluation of grid processor architectures , 2001, MICRO.

[21] Shubhendu S. Mukherjee,et al. The Alpha 21364 network architecture , 2001, HOT 9 Interconnects. Symposium on High Performance Interconnects.

[22] Timothy Mark Pinkston,et al. Characterization of Deadlocks in Irregular Networks , 2002, J. Parallel Distributed Comput..

[23] Scott Hauck,et al. Reconfigurable computing: a survey of systems and software , 2002, CSUR.

[24] Radu Marculescu,et al. Exploiting the Routing Flexibility for Energy/Performance Aware Mapping of Regular NoC Architectures , 2003, DATE.

[25] Timothy Mark Pinkston,et al. A clustering approach for identifying and quantifying irregularities in interconnection networks , 2003, IEEE Trans. Parallel Distributed Syst..

[26] Jeffrey S. Vetter,et al. Communication characteristics of large-scale scientific applications for contemporary cluster architectures , 2002, Proceedings 16th International Parallel and Distributed Processing Symposium.

[27] José Duato,et al. Deadlock-Free Dynamic Reconfiguration Schemes for Increased Network Dependability , 2003, IEEE Trans. Parallel Distributed Syst..

[28] Timothy Mark Pinkston,et al. A methodology for designing efficient on-chip interconnects on well-behaved communication patterns , 2003, The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings..

[29] Srinivasan Murali,et al. Bandwidth-constrained mapping of cores onto NoC architectures , 2004, Proceedings Design, Automation and Test in Europe Conference and Exhibition.