An integral matrix-based technique for systematic systolic design

In this paper we consider a systematic mapping procedure for systolic arrays. Integral matrix theory provides the basic concepts used to define the projection and scheduling vectors. We define unimodular matrices that describe the projection, the timing, and bases for the processor space, and that yield a correct timing function. Using these matrices couples the definitions of correct projection and scheduling functions and provides relatively simple tools for the design. The same mathematical description furnishes a rigorous definition of the partitioning block structure, as well as of the cluster set. Both the locally parallel, globally sequential (LPGS) and the locally sequential, globally parallel (LSGP) partitioning schemes, as well as a number of intermediate schemes, can be generated with this technique. Folding (understood here as the spatial relocation of portions of processing elements) and a number of design constraints can also be included, and are briefly considered. The possible application of systolic design to low-power requirements is also discussed.
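As an illustration of how a unimodular matrix can couple a schedule with a projection, the following is a minimal sketch and not the construction used in the paper. It assumes a 3-D index space with the uniform dependences of matrix multiplication; the names `schedule`, `projection`, `allocation`, and `space_time` and all numeric values are hypothetical choices made for this example only.

```python
def dot(x, y):
    """Integer dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(x, y))


def mat_vec(m, v):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, v) for row in m]


def det3(m):
    """Exact integer determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Assumed example data (not taken from the paper).
schedule = [1, 1, 1]                  # scheduling vector: time = schedule . point
projection = [0, 0, 1]                # projection vector: points along it share a PE
allocation = [[1, 0, 0], [0, 1, 0]]   # rows form a basis of the processor space
dependences = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Stack the schedule on top of the allocation rows to get one space-time matrix.
space_time = [schedule] + allocation


def is_unimodular(m):
    """An integral matrix is unimodular when its determinant is +1 or -1."""
    return det3(m) in (1, -1)


def is_valid_mapping():
    """Check the usual conditions for a linear space-time mapping."""
    causal = all(dot(schedule, d) > 0 for d in dependences)    # data arrives in time
    no_conflict = dot(schedule, projection) != 0               # one point per PE per step
    projects = mat_vec(allocation, projection) == [0, 0]       # allocation kills projection
    return causal and no_conflict and projects and is_unimodular(space_time)


def map_point(point):
    """Return (time step, processor coordinates) for an index point."""
    image = mat_vec(space_time, point)
    return image[0], tuple(image[1:])


if __name__ == "__main__":
    print(is_valid_mapping())
    print(map_point([2, 3, 1]))
```

Because the stacked matrix is unimodular, it maps the integer lattice onto itself one-to-one, so every integer (time, processor) pair in the image corresponds to exactly one index point. That is the sense in which a single matrix can tie the scheduling and projection choices together in this sketch.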
