A Fast Parallel Implementation of the Berlekamp-Massey Algorithm with a One-D Systolic Array Architecture

In this paper we present a fast parallel version of the Berlekamp-Massey (BM) algorithm based on a one-dimensional (1D), or linear, systolic array architecture composed of a series of m cells (processing units), where m is the size of the given data, i.e., the length of the input sequence. The 1D systolic array has only local communication links between neighboring cells, with no global or nonlocal links between distant cells. Each cell executes a small, fixed number of operations at every time unit. Our implementation with the 1D systolic array architecture attains time complexity \(\mathcal{O}\left( m \right)\), so that the total complexity, i.e., the number of cells multiplied by the running time, is the optimal \(\mathcal{O}\left( {m^2 } \right)\). Both requirements, (1) maximum throughput rate and (2) local communication, are therefore satisfied, as is the case with some fast parallel implementations of the extended Euclidean algorithm. Our method gives not only another proof of the equivalence between the Berlekamp-Massey algorithm and the extended Euclidean algorithm, in particular in the realm of parallel processing, but also alternative practical and efficient architectures for Reed-Solomon (RS) decoders.
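For orientation, the conventional serial BM algorithm, whose \(\mathcal{O}(m^2)\) work the systolic array distributes over m cells, can be sketched as follows. This is only an illustrative sketch under stated assumptions: it is restricted to the binary field GF(2) for brevity, the function name `berlekamp_massey_gf2` is hypothetical, and it does not reproduce the cell-level design of the systolic array described in this paper.

```python
def berlekamp_massey_gf2(s):
    """Serial Berlekamp-Massey over GF(2) (illustrative sketch, not the systolic design).

    s: list of bits (0/1), the input sequence of length m.
    Returns (L, c), where L is the linear complexity and c = [c_0, ..., c_L]
    is the connection polynomial with c_0 = 1, so that
    s[i] = c_1*s[i-1] xor ... xor c_L*s[i-L] for all i >= L.
    """
    n = len(s)
    c = [0] * (n + 1)  # current connection polynomial
    b = [0] * (n + 1)  # copy of c saved at the last length change
    c[0] = b[0] = 1
    L = 0              # current linear complexity
    m = -1             # index at which the last length change occurred

    for i in range(n):
        # Discrepancy between s[i] and the value predicted by c.
        d = s[i]
        for j in range(1, L + 1):
            d ^= c[j] & s[i - j]

        if d:
            t = c[:]               # keep the old c in case the length changes
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]   # c(x) <- c(x) + x^shift * b(x)
            if 2 * L <= i:
                L = i + 1 - L
                m = i
                b = t

    return L, c[:L + 1]


# Example: ten terms generated by s[i] = s[i-1] xor s[i-3] with seed 1, 0, 0.
sequence = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0]
length, poly = berlekamp_massey_gf2(sequence)
print(length, poly)
```

The two nested loops make the serial cost quadratic in the sequence length. A linear array of m cells that completes in \(\mathcal{O}(m)\) time units therefore performs the same order of total work, which is the sense in which the total complexity above is optimal.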