High throughput VLSI implementation of discrete orthogonal transforms using bit-level vector-matrix multiplier

In this paper, we propose a fully pipelined two-dimensional (2-D) bit-level systolic architecture for efficient implementation of discrete orthogonal transforms using a serial-parallel vector-matrix multiplication scheme based on the Baugh-Wooley algorithm. Apart from its regularity and simplicity, the proposed structure yields high throughput due to massive parallelism across the 2-D mesh. The area- and time-complexities of the proposed structure are $O(N^2)$ and $O(2nN^2)$, respectively, for implementation of an N-point transform, where n is the wordlength.
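To make the arithmetic concrete, the following is a minimal software sketch of a bit-level vector-matrix product built on Baugh-Wooley two's-complement multiplication. It uses the commonly cited complemented-partial-product form of the algorithm and models only the arithmetic, not the systolic timing or cell layout of the proposed architecture. The function names (`to_bits`, `baugh_wooley_mul`, `vector_matrix_multiply`) and the integer coefficient matrix are assumptions made for illustration and do not come from the paper.

```python
def to_bits(x, n):
    """Return the n two's-complement bits of x, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def baugh_wooley_mul(a, b, n):
    """Multiply two n-bit two's-complement integers using only 1-bit
    partial products (AND gates) and additions, Baugh-Wooley style.

    Partial products that involve exactly one sign bit are complemented,
    and the correction constant 2^n + 2^(2n-1) is added, so every term
    in the array is non-negative.
    """
    a_bits, b_bits = to_bits(a, n), to_bits(b, n)
    acc = (1 << n) + (1 << (2 * n - 1))  # correction constants
    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]  # one bit-level cell
            # Complement terms that contain exactly one sign bit.
            if (i == n - 1) != (j == n - 1):
                pp ^= 1
            acc += pp << (i + j)
    acc &= (1 << (2 * n)) - 1  # keep the 2n-bit product
    if acc >> (2 * n - 1):  # reinterpret as a signed value
        acc -= 1 << (2 * n)
    return acc


def vector_matrix_multiply(x, c, n):
    """Compute y[j] = sum_i x[i] * c[i][j] using bit-level products.

    x is a length-N input vector and c an N x N coefficient matrix,
    both holding n-bit two's-complement integers (an assumption here;
    real transform kernels would be quantised to n bits first).
    """
    size = len(x)
    return [
        sum(baugh_wooley_mul(x[i], c[i][j], n) for i in range(size))
        for j in range(size)
    ]


if __name__ == "__main__":
    x = [3, -2, 7]
    c = [[1, -4, 2], [5, 0, -3], [-6, 7, 1]]
    y = vector_matrix_multiply(x, c, n=4)
    # Check against ordinary integer arithmetic.
    expected = [sum(x[i] * c[i][j] for i in range(3)) for j in range(3)]
    assert y == expected
    print(y)
```

Each iteration of the inner double loop corresponds to one 1-bit partial product, which is the kind of operation a bit-level cell would perform; the sketch evaluates them sequentially, whereas a 2-D mesh would evaluate many of them in parallel.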
