Systolic arrays for LU decomposition

The author presents several systolic arrays for decomposing a matrix into its lower and upper triangular factors (LU decomposition). These architectures have been formally derived using techniques for synthesizing systolic arrays from affine recurrence equations, and the entire design process can be automated. The initial specification is a high-level one, similar to a nested-loop program, and a technique called explicit pipelining is used to automatically localize the data dependencies. The architectures presented have interesting features, such as control signals and specialized behavior of certain processors (for example, boundary processors). These characteristics, as well as processor initialization signals, can be derived automatically.
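
For orientation, the kind of nested-loop program that such a high-level specification resembles can be sketched as follows. This is a minimal illustration only, assuming the textbook Doolittle form of LU decomposition without pivoting; it is not the author's specification, and the name lu_decompose is hypothetical.

```python
def lu_decompose(a):
    """Return (l, u) with l unit lower triangular and u upper triangular.

    Assumption: textbook Doolittle elimination without pivoting, so a zero
    pivot is reported as an error rather than handled.
    """
    n = len(a)
    u = [row[:] for row in a]  # working copy that becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):  # elimination step
        if u[k][k] == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):  # rows below the pivot
            l[i][k] = u[i][k] / u[k][k]  # multiplier stored in L
            for j in range(k, n):  # update the remainder of row i
                u[i][j] -= l[i][k] * u[k][j]
    return l, u


l, u = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
```

In this form the pivot row u[k][j] and the multiplier l[i][k] are each read by many iterations of the inner loops. Shared reads of this kind are the sort of non-local data dependency that a localization step such as the explicit pipelining mentioned above would need to address.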