MULTIDIMENSIONAL SYSTOLIC ARRAYS FOR DSP APPLICATIONS

This paper presents an algorithm transformation technique that recasts suitable DSP algorithms for multidimensional systolic array implementation. The aim of the transformation is to speed up computation without much increase in area requirement. The application of the technique to some DSP algorithms is presented. The resulting systolic networks are implemented using the NORA CMOS logic structure and laid out in 3 µm p-well CMOS technology. The areas and times of the resulting architectures are then evaluated and discussed.

Localized operations, intensive computation, and matrix operations are features of many algorithms used in signal and image processing (KungS88). Many of these algorithms are also locally recursive. These computational features can be exploited to facilitate the design of special-purpose signal/image processing array processors. There has therefore been considerable effort in the development of systematic and efficient methods for transforming and mapping such algorithms onto systolic array structures. A brief review can be found in (Ling88a). Algorithm transformation and mapping is a technique for transforming a given algorithm into the desired form and generating a suitable systolic array to implement it.

With the rising demand for high-speed computation in digital signal processing (DSP) applications, the need to speed up algorithm computation has increased. The successes and benefits of 3-D VLSI have made it possible to produce arrays with fast computation time without much increase in area (cost) requirement by using the third dimension (Tera87, Inou86). Examples of throughput improvements obtained by using 3-D systolic arrays include (Lind84) and (Pann85).

In this paper, an attempt is made to increase the dimension of the index space for suitable signal processing algorithms in a systematic way in order to achieve higher parallelism without increasing area complexity. The resulting algorithms can then be mapped onto multidimensional systolic array networks and implemented in 2-D or 3-D VLSI. By doing so, the computation time and its order of complexity can be significantly improved while keeping the number of processing cells constant. The price to be paid is the small amount of additional circuitry (usually in the form of adders and interconnection wires) required for inter-row or inter-plane communications.
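To make the idea of increasing the index-space dimension concrete, the following is a minimal Python sketch and not the transformation developed in this paper. It assumes an FIR filter as the example algorithm and splits the tap index k into two indices (p, q) with k = p * cols + q, so that the (n, k) index space becomes (n, p, q). The function names `fir_direct`, `fir_split`, and `serial_steps`, and the step-count model in `serial_steps`, are hypothetical and introduced only for illustration.

```python
def fir_direct(w, x):
    """y[n] = sum_k w[k] * x[n - k], evaluated over the 2-D index space (n, k)."""
    taps = len(w)
    return [
        sum(w[k] * x[n - k] for k in range(taps) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_split(w, x, rows):
    """Same filter with k rewritten as k = p * cols + q (3-D index space n, p, q).

    Each row p accumulates its own partial sum over q; the row results are
    then combined by one extra addition stage (standing in for inter-row adders).
    """
    taps = len(w)
    if taps % rows != 0:
        raise ValueError("rows must divide the number of taps")
    cols = taps // rows
    y = []
    for n in range(len(x)):
        partial = []
        for p in range(rows):          # rows could run in parallel in hardware
            acc = 0
            for q in range(cols):      # serial accumulation inside one row
                k = p * cols + q
                if n - k >= 0:
                    acc += w[k] * x[n - k]
            partial.append(acc)
        y.append(sum(partial))         # inter-row addition
    return y


def serial_steps(taps, rows):
    """Rough, assumed model of the longest accumulation chain per output.

    Returns (single-row chain, split chain), where the split chain is
    cols steps inside a row plus rows - 1 additions across rows.
    """
    cols = taps // rows
    return taps, cols + rows - 1


if __name__ == "__main__":
    w = list(range(1, 17))             # 16 integer taps
    x = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3]
    assert fir_split(w, x, rows=4) == fir_direct(w, x)
    print(serial_steps(16, 4))
```

Both functions use the same number of multiply-accumulate operations per output. Under the assumed model, only the length of the longest serial chain changes, at the cost of the extra inter-row additions.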
