Parallel Modular Multiplication on Multi-core Processors

Current processors typically embed many cores running at high speed. The main goal of this paper is to assess the efficiency of software parallelism for low-level arithmetic operations by providing a thorough comparison of several parallel modular multiplication algorithms. Well-known methods such as Barrett and Montgomery multiplication, as well as more recent algorithms, are compared with a novel k-ary multipartite multiplication, which allows the computation to be split into independent processes. Our experiments show that this new algorithm is well suited to software parallelism.