A quantitative comparison of reconfigurable, tiled, and conventional architectures on bit-level computation

General purpose computing architectures are being called on to work on a more diverse application mix every day. This has been fueled by the need for reduced time to market and economies of scale that are the hallmarks of software on general purpose microprocessors. As this application mix expands, application domains such as bit-level computation, which has primarily been the domain of ASICs and FPGAs, and need to be effectively handled by general purpose hardware. Examples of bit-level applications include Ethernet framing, forward error correction encoding/decoding, and efficient state machine implementation. In this work we compare how differing computational structures such as ASICs, FPGAs, tiled architectures, and superscalar microprocessors are able to compete on bit-level communication applications. A quantitative comparison in terms of absolute performance and performance per area is presented. These results show that although modest gains (2-3x) in absolute performance can be achieved when using FPGAs versus tuned microprocessor implementations, it is the significantly larger gains (2-3 orders of magnitude) that can be achieved in performance per area that motivates work on supporting bit-level computation in a general purpose fashion in the future.

[1]  André DeHon,et al.  MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources , 1996, 1996 Proceedings IEEE Symposium on FPGAs for Custom Computing Machines.

[2]  George Varghese,et al.  HSRA: high-speed, hierarchical synchronous reconfigurable array , 1999, FPGA '99.

[3]  Scott Hauck,et al.  The Chimaera reconfigurable functional unit , 1997, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[4]  William J. Dally,et al.  Smart Memories: a modular reconfigurable architecture , 2000, ISCA '00.

[5]  Duncan A. Buell,et al.  Splash 2 - FPGAs in a custom computing machine , 1996 .

[6]  Victor Lee,et al.  The RAW benchmark suite: computation structures for general purpose computing , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186).

[7]  Seth Copen Goldstein,et al.  PipeRench: a co/processor for streaming multimedia acceleration , 1999, ISCA.

[8]  Karthikeyan Sankaralingam,et al.  A design space evaluation of grid processor architectures , 2001, MICRO.

[9]  Carl Ebeling,et al.  Mapping applications to the RaPiD configurable architecture , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186).

[10]  Henry Hoffmann,et al.  The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.

[11]  Michael D. Smith,et al.  A high-performance microarchitecture with hardware-programmable functional units , 1994, Proceedings of MICRO-27. The 27th Annual IEEE/ACM International Symposium on Microarchitecture.

[12]  Peter A. Franaszek,et al.  A DC-Balanced, Partitioned-Block, 8B/10B Transmission Code , 1983, IBM J. Res. Dev..

[13]  John Wawrzynek,et al.  The Garp Architecture and C Compiler , 2000, Computer.

[14]  Michael Bolotski Andr,et al.  Unifying FPGAs and SIMD Arrays , 1994 .

[15]  John Wawrzynek,et al.  Adapting software pipelining for reconfigurable computing , 2000, CASES '00.

[16]  Daniel P. Lopresti,et al.  Building and using a highly parallel programmable logic array , 1991, Computer.