Three-dimensional computational pipelining with minimal latency and maximum throughput for L-U factorization

A three-dimensional (3-D) wavefront array with a minimal computation time (latency) of 2n-2 cycles for an n × n matrix and a minimal block pipelining period of one is introduced and compared to existing two-dimensional (2-D) systolic array architectures for L-U factorization. An optimal processor-time product of (1/3)n^3, with each cycle defined computationally as two operations, is obtained when successive problem instances are considered. The 3-D architecture is extensible and scalable, is cycle invariant in all respects, has a minimal node complexity of two arithmetic operations per cycle, has unidirectional data forwarding in three dimensions, has 100% utilization of processing elements for successive inputs, and has a cycle-invariant one-to-one correspondence between input/output ports and input/output matrix elements.
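To make the cubic processor-time figure concrete, the following is a minimal sequential sketch of L-U factorization in Python. It is not the 3-D wavefront array from the abstract. It is an ordinary Doolittle-style elimination without pivoting, and it counts the multiply-subtract updates that an array would have to distribute across its nodes. The function names `lu_doolittle` and `expected_updates` are hypothetical, chosen for illustration, and the code assumes every pivot is nonzero. Each counted update is one multiply and one subtract, which matches the two arithmetic operations per cycle stated above. The total is (n-1)n(2n-1)/6, whose leading term is (1/3)n^3.

```python
def lu_doolittle(a):
    """Factor a square matrix as A = L * U without pivoting.

    Returns (l, u, updates), where `updates` counts the inner
    multiply-subtract steps. Assumes every pivot is nonzero.
    """
    n = len(a)
    u = [[float(x) for x in row] for row in a]  # working copy, becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    updates = 0
    for k in range(n - 1):                # elimination step k
        for i in range(k + 1, n):         # rows below the pivot row
            l[i][k] = u[i][k] / u[k][k]   # multiplier stored in L
            u[i][k] = 0.0
            for j in range(k + 1, n):     # columns right of the pivot
                u[i][j] -= l[i][k] * u[k][j]  # one multiply, one subtract
                updates += 1
    return l, u, updates


def expected_updates(n):
    """Closed form for the update count: sum of m^2 for m = 1..n-1."""
    return (n - 1) * n * (2 * n - 1) // 6


if __name__ == "__main__":
    a = [[4, 3, 2, 1],
         [3, 4, 3, 2],
         [2, 3, 4, 3],
         [1, 2, 3, 4]]
    l, u, updates = lu_doolittle(a)
    print(updates, expected_updates(len(a)))
```

For the 4 × 4 example the count is 9 + 4 + 1 = 14 updates. The sketch says nothing about the 2n-2 cycle latency, which depends on how those updates are scheduled across the array.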
