A novel and efficient routing architecture for multi-FPGA systems

Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture, which is the manner in which wires, FPGAs and field-programmable interconnect devices (FPIDs) are connected. Several routing architectures for MFSs have been proposed, and previous research has shown that the partial crossbar is one of the best existing architectures. In this paper, we propose a new routing architecture, called the hybrid complete-graph and partial-crossbar (HCGP), which has superior speed and cost compared to a partial crossbar. The new architecture uses both hard-wired and programmable connections between the FPGAs. We compare the performance and cost of the HCGP and partial crossbar architectures experimentally, by mapping a set of 15 large benchmark circuits into each architecture. A customized set of partitioning and interchip routing tools was developed, with particular attention paid to architecture-appropriate interchip routing algorithms. We show that the cost of the partial crossbar (as measured by the number of pins on all FPGAs and FPIDs required to fit a design) is on average 20% more than that of the new HCGP architecture and as much as 25% more. Furthermore, the critical path delay for designs implemented on the partial crossbar was on average 20% more than on the HCGP architecture and up to 43% more. Using our experimental approach, we also explore a key architecture parameter associated with the HCGP architecture, the proportion of hard-wired connections versus programmable connections, to determine its best value.
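To make the hard-wired versus programmable trade-off concrete, the following is a minimal sketch of how the I/O pins of each FPGA might be divided between the two kinds of connection. It is illustrative only and is not the tool flow used in the paper. The function names, the parameter `hardwired_fraction`, the even division of hard-wired pins among the other FPGAs, and the assumption that every programmable FPGA pin is matched by exactly one FPID pin are all assumptions introduced here.

```python
def hcgp_pin_split(num_fpgas, pins_per_fpga, hardwired_fraction):
    """Divide one FPGA's I/O pins between hard-wired and programmable links.

    Assumption (not from the abstract): hard-wired pins are shared evenly
    among the other FPGAs, forming a complete graph, and whatever is left
    goes to the programmable (FPID) side.
    """
    if num_fpgas < 2:
        raise ValueError("need at least two FPGAs")
    if not 0.0 <= hardwired_fraction <= 1.0:
        raise ValueError("hardwired_fraction must lie in [0, 1]")

    peers = num_fpgas - 1
    # Round down so every FPGA pair gets the same number of direct wires.
    wires_per_pair = int(pins_per_fpga * hardwired_fraction) // peers
    hardwired_pins = wires_per_pair * peers
    programmable_pins = pins_per_fpga - hardwired_pins
    return {
        "wires_per_pair": wires_per_pair,
        "hardwired_pins": hardwired_pins,
        "programmable_pins": programmable_pins,
    }


def total_pin_count(num_fpgas, pins_per_fpga, hardwired_fraction):
    """Total pins on all FPGAs and FPIDs for a fixed number of FPGAs.

    Assumption: each programmable FPGA pin needs one matching FPID pin.
    A fraction of 0.0 then models a pure partial crossbar.
    """
    split = hcgp_pin_split(num_fpgas, pins_per_fpga, hardwired_fraction)
    fpga_pins = num_fpgas * pins_per_fpga
    fpid_pins = num_fpgas * split["programmable_pins"]
    return fpga_pins + fpid_pins


if __name__ == "__main__":
    # Hypothetical board: 4 FPGAs with 192 usable I/O pins each.
    for fraction in (0.0, 0.5, 1.0):
        print(fraction, hcgp_pin_split(4, 192, fraction),
              total_pin_count(4, 192, fraction))
```

This sketch only counts pins for a fixed board size. The cost figures reported in the abstract come from mapping real benchmark circuits, where the number of FPGAs needed to fit a design also depends on the architecture, so the sketch should not be expected to reproduce them.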
