A systolic array architecture for multiplying Toeplitz matrices

We demonstrate a systolic array architecture for multiplying two Toeplitz matrices. The regular systolic matrix multiplication algorithm requires O(n^2) area. Intuitively, since there are only O(n) unique elements in an n x n Toeplitz matrix, we expect a solution with O(n) area, which is described in this paper.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1992 ACM 0-89791-502-X/92/0002/0933...$1.50

Introduction

A systolic system is a network of processors which rhythmically compute and pass data through the system [4]. A systolic array design differs from the conventional Von Neumann machine in its highly pipelined computation. More precisely, once a data item is fetched from memory, it can be used effectively at each cell it passes while being pumped from cell to cell along the array. This is especially suited to a wide class of compute-bound computations, where multiple operations are performed on each data item in a repetitive manner. It avoids the classic memory access bottleneck commonly incurred in Von Neumann machines.

The major reasons for adopting systolic arrays for special-purpose processing architectures are briefly discussed now [3]. Systolic arrays exhibit a simple and regular design, which leads to great savings in VLSI (Very Large Scale Integration) design costs. They use concurrency for speed-up, but concurrency leads to significant communication costs for a large number of processors. In VLSI, routing costs dominate the power, time and area required to implement a computation [5]. Systolic arrays use regular and local communication, which reduces the routing costs. Systolic arrays strike a good balance between computation and input-output (I/O), using a limited number of I/O pads (pins), while operating at roughly the same speed as more obvious algorithms that require many more input pads [6, Chap. 5].

A Toeplitz matrix is an n x n matrix such that $a_{i,j} = a_{i-1,j-1}$ for $2 \le i, j \le n$, i.e., the elements along each diagonal are all the same. Toeplitz matrices arise in many areas, such as time series analysis, image processing, control theory, statistics, and integral equations [2]. For real-time signal and image processing, we would need operations on Toeplitz matrices to occur in synchronism with the data, and this motivates the attempt to build special-purpose architectures for multiplying such matrices.

Representing Toeplitz matrices

Two 4 x 4 Toeplitz matrices A and B are shown below.