Multiprocessor implementation of recursive least squares algorithms using a parallel block processing method

An efficient real-time implementation of recursive least squares (RLS) algorithms on a multiprocessor system connected in a ring network is investigated. The method is based on a parallel block processing approach, in which the input data stream is divided into blocks and the blocks are assigned to the processors in rotation. To resolve the dependencies between processors, the decomposition and look-ahead methods are used. The former is used for algorithms that have decomposable structures, such as the QR algorithms based on the Givens rotation. The latter is applied to undecomposable algorithms, including the direct correlation methods and the QR algorithms based on the Householder transformation. The performance of the system, including the maximum throughput and the efficiency, is analyzed quantitatively.
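
As an illustration of the block assignment step only, the rotation of blocks among processors can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation analyzed in the paper: the function name `assign_blocks` and its parameters `block_size` and `num_procs` are hypothetical, and the sketch does not model the ring communication, the RLS update itself, or the decomposition and look-ahead methods used to resolve dependencies between processors.

```python
from typing import List, Sequence, Tuple

# One scheduled block: (global block index, samples in that block).
Block = Tuple[int, List[float]]


def assign_blocks(samples: Sequence[float], block_size: int, num_procs: int) -> List[List[Block]]:
    """Split a sample stream into blocks and deal them to processors in rotation.

    Hypothetical helper for illustration; names and signature are assumptions.
    """
    if block_size <= 0 or num_procs <= 0:
        raise ValueError("block_size and num_procs must be positive")

    # schedule[p] collects the blocks that processor p would handle.
    schedule: List[List[Block]] = [[] for _ in range(num_procs)]
    for block_index, start in enumerate(range(0, len(samples), block_size)):
        block = list(samples[start:start + block_size])
        # Block k goes to processor k mod P, i.e. assignment in rotation.
        schedule[block_index % num_procs].append((block_index, block))
    return schedule


if __name__ == "__main__":
    for proc, blocks in enumerate(assign_blocks(list(range(10)), block_size=2, num_procs=3)):
        print(proc, blocks)
```

In this sketch each processor receives every `num_procs`-th block, so a processor has roughly `num_procs` block periods to finish one block before its next block arrives. The state a processor needs from the preceding block is exactly the dependency that the abstract says is resolved by the decomposition or look-ahead method.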