Using Warp as a supercomputer in signal processing

Warp is a programmable systolic array machine designed by CMU and built together with its industrial partners, GE and Honeywell. The first large-scale version of the machine, with an array of 10 linearly connected cells, will become operational in January 1986. Each cell in the array is capable of performing 10 million 32-bit floating-point operations per second (10 MFLOPS). The 10-cell array can achieve a performance of 50 to 100 MFLOPS for a large variety of signal processing operations such as digital filtering, image compression, and spectral decomposition. The machine, augmented by a Boundary Processor, is particularly effective for computationally expensive matrix algorithms, such as the solution of linear systems, QR decomposition, and singular value decomposition, which are crucial to many real-time signal processing tasks. This paper outlines the Warp implementation of the 2-dimensional Discrete Cosine Transform and singular value decomposition.
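For readers unfamiliar with the 2-dimensional Discrete Cosine Transform named above, the following is a minimal reference sketch in Python. It assumes the orthonormal DCT-II definition and the standard row-column (separable) decomposition. It is not the Warp implementation, which the abstract does not detail. The function names `dct_1d` and `dct_2d` are illustrative choices.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence, evaluated directly in O(N^2)."""
    n = len(x)
    out = []
    for k in range(n):
        # Scaling that makes the transform orthonormal (an assumed convention).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        )
        out.append(scale * acc)
    return out


def dct_2d(block):
    """2-D DCT by the row-column method: transform rows, then columns."""
    rows = [dct_1d(r) for r in block]
    # zip(*rows) iterates over columns of the row-transformed matrix.
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols[j][k] holds coefficient k of column j; transpose back to [k][j].
    return [list(r) for r in zip(*cols)]


if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]
    coeffs = dct_2d(flat)
    # A constant block puts all of its energy in the DC term coeffs[0][0].
    print(coeffs[0][0])
```

Because the transform is separable, each pass is a batch of independent 1-D transforms. That regular structure is what makes the operation a natural candidate for array processors in general.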
