Recursive least-squares using a hybrid Householder algorithm on massively parallel SIMD systems

Abstract: Within the context of recursive least-squares, the implementation of a Householder algorithm for block updating the QR decomposition on massively parallel SIMD systems is considered. Two implementations are investigated first, each based on a different mapping strategy for distributing the data matrices over the processing elements of the parallel computer. Timing models show that neither implementation is superior in all cases. To increase computational speed, a hybrid implementation uses performance models to partition the problem into two subproblems, which are then solved using the first and second implementations, respectively.
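
To make the block-updating step concrete, the following is a minimal serial sketch in Python with NumPy. It is not the parallel SIMD implementation discussed here, and it says nothing about the data mappings or the hybrid partitioning. It assumes the usual formulation in which an upper triangular factor R is updated with a block of new observation rows Z, so that the updated factor satisfies R_new^T R_new = R^T R + Z^T Z. The function name `block_update_r` and the variable names are hypothetical, chosen only for illustration.

```python
import numpy as np


def block_update_r(R, Z):
    """Update the triangular factor R with a block of new rows Z.

    R is n-by-n upper triangular and Z is k-by-n. One Householder
    reflection per column annihilates column j of Z against R[j, j].
    Illustrative serial sketch only.
    """
    R = np.array(R, dtype=float)  # work on copies
    Z = np.array(Z, dtype=float)
    n = R.shape[0]
    for j in range(n):
        x0 = R[j, j]
        z = Z[:, j].copy()
        norm = np.sqrt(x0 * x0 + z @ z)
        if norm == 0.0:
            continue  # nothing to annihilate in this column
        # Choose the sign of alpha to avoid cancellation in v0.
        alpha = -norm if x0 >= 0 else norm
        v0 = x0 - alpha
        # Householder vector is [v0; z]; the zero rows of R between
        # row j and the block Z are skipped because they are unaffected.
        beta = 2.0 / (v0 * v0 + z @ z)
        w = beta * (v0 * R[j, j:] + z @ Z[:, j:])
        R[j, j:] -= v0 * w
        Z[:, j:] -= np.outer(z, w)
    return R
```

As a usage example, starting from `R = np.linalg.qr(A, mode="r")` for some data matrix `A` and calling `block_update_r(R, Z)` should give a factor of the stacked matrix `[A; Z]`, up to the signs of its rows. Each reflection only touches one row of R and the block Z, which is the structure a parallel mapping would have to distribute.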