Application-Specific Hardware: Computing Without CPUs

In this paper we propose a new architecture for general-purpose computing that combines a reconfigurable-hardware substrate with compiler technology to generate Application-Specific Hardware (ASH). The novelty of this architecture is that resources are not shared: each static program instruction can have its own dedicated hardware implementation. ASH enables the synthesis of circuits with only local computation structures, which promise to be fast, inexpensive, and low-power. This paper also presents a scalable compiler framework for ASH, which generates hardware from programs written in C, along with some evaluations of the resources necessary for implementing realistic programs.
