On Simplifying Placement and Routing by Extending Coarse-Grained Reconfigurable Arrays with Omega Networks

Most reconfigurable computing architectures suffer from computationally demanding Placement and Routing (P&R) steps, which may hamper their use in contexts requiring dynamic compilation (e.g., to guarantee application portability in embedded systems). With the aim of simplifying P&R, this paper presents and analyzes a coarse-grained reconfigurable array extended with global Omega Networks. We show that integrating one or two Omega Networks in a coarse-grained array simplifies the P&R stage with both low hardware resource overhead and low performance degradation (18% for an 8×8 array). The experimental results compare the coarse-grained array extended with one or two Omega Networks against a coarse-grained array based on a grid of processing elements with neighbor connections. Comparing the execution time of the P&R stage for the two arrays, we show that the array using two Omega Networks needs a far simpler P&R, which for the benchmarks used completed on average in about 20× less time.
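
To illustrate why routing through an Omega Network can be cheap, the sketch below uses the textbook destination-tag scheme for an Omega Network with 2^n inputs: each of the n stages applies a perfect shuffle and then a 2×2 switch selects its output from one bit of the destination address. This is an illustrative assumption, not the routing algorithm of the paper, which the abstract does not describe; the names `omega_path` and `route_all` are hypothetical, and letting one source share links with itself (fan-out) is also an assumption.

```python
def omega_path(src, dst, n_bits):
    """Return the line occupied after each stage when routing src -> dst.

    Destination-tag routing: at stage i the perfect shuffle shifts the
    current line address left, and the 2x2 switch sets the low bit to
    bit (n_bits - 1 - i) of the destination (most significant bit first).
    """
    mask = (1 << n_bits) - 1
    pos = src
    path = []
    for stage in range(n_bits):
        bit = (dst >> (n_bits - 1 - stage)) & 1
        pos = ((pos << 1) & mask) | bit
        path.append(pos)
    return path


def route_all(connections, n_bits):
    """Greedily route (src, dst) pairs; return the pairs that are blocked.

    A connection is blocked when some stage output it needs is already
    held by a different source (an Omega Network is a blocking network).
    """
    used = {}      # (stage, line) -> source currently driving that link
    blocked = []
    for src, dst in connections:
        path = omega_path(src, dst, n_bits)
        if any(used.get((s, p), src) != src for s, p in enumerate(path)):
            blocked.append((src, dst))
            continue
        for s, p in enumerate(path):
            used[(s, p)] = src
    return blocked
```

Each connection costs only n bit operations to route, and a conflict is detected with a table lookup. In an 8-input network, for example, `route_all([(0, 0), (4, 1)], 3)` reports `(4, 1)` as blocked because inputs 0 and 4 meet at the same first-stage switch and both request its upper output. Conflicts of this kind are what a second network or a different placement would have to resolve.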
