Systolic Givens factorization of dense rectangular matrices

Given an m by n dense matrix A (m ≥ n), we consider parallel algorithms to compute its orthogonal factorization via Givens rotations. First we describe an algorithm that executes in m + n - 2 steps on a linear array of [m/2] processors, a step being the time needed to perform one Givens rotation. The pipelined version of the new algorithm leads to a systolic implementation whose area-time performance surpasses that of the arrays of Bojanczyk, Brent and Kung [1] and Gentleman and Kung [5].
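
For readers unfamiliar with the elementary operation counted as one step above, the following is a minimal sequential sketch of triangularization by Givens rotations. It is not the parallel schedule or the systolic array of this paper: the rotations are applied one at a time, column by column, and the names `givens` and `givens_triangularize` are hypothetical, chosen only for illustration.

```python
import math


def givens(a, b):
    """Return (c, s) such that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0  # nothing to annihilate: identity rotation
    r = math.hypot(a, b)
    return a / r, b / r


def givens_triangularize(A):
    """Reduce an m-by-n matrix (m >= n, given as a list of rows) to upper
    triangular form with Givens rotations.

    Sequential illustration only (assumption: rows i-1 and i are combined to
    zero entry (i, j), sweeping each column from the bottom up). Returns the
    triangular factor R and the number of rotations applied.
    """
    R = [[float(x) for x in row] for row in A]
    m, n = len(R), len(R[0])
    rotations = 0
    for j in range(n):
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            # Apply the rotation to rows i-1 and i, columns j..n-1.
            for k in range(j, n):
                top, bot = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
            R[i][j] = 0.0  # remove rounding residue in the annihilated entry
            rotations += 1
    return R, rotations


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6], [7, 8, 10], [2, 1, 0]]
    R, count = givens_triangularize(A)
    for row in R:
        print(["%.4f" % x for x in row])
    print("rotations:", count)
```

In this sequential form, column j (counting from 0) requires m - 1 - j rotations. A parallel algorithm such as the one described above gains its speed by performing rotations on disjoint row pairs within the same step.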