The Design of Optimal Systolic Arrays

Conventional design of systolic arrays is based on mapping an algorithm onto an interconnection of processing elements in a VLSI chip. This mapping is done in an ad hoc manner, and the resulting configuration usually represents a feasible but suboptimal design. In this paper, systolic arrays are characterized by three classes of parameters: the velocities of data flows, the spatial distributions of data, and the periods of computation. By relating these parameters in constraint equations that govern the correctness of the design, the design problem is formulated as an optimization problem. The size of the search space is polynomial in the problem size, and a methodology is proposed to systematically search and reduce this space and obtain the optimal design. Examples of applying the method are given, including matrix multiplication, finite impulse response filtering, deconvolution, and triangular-matrix inversion.
