Implementation of RSA Algorithm Based on RNS Montgomery Multiplication

We previously proposed a fast parallel algorithm for Montgomery multiplication based on residue number systems (RNS). This paper describes an implementation of the RSA cryptosystem using that RNS Montgomery multiplication. We discuss how to choose the RNS base size and the number of parallel processing units. An implementation method using the Chinese Remainder Theorem (CRT) is also presented. An LSI prototype adopting the proposed Cox-Rower Architecture performs a 1024-bit RSA transaction in 4.2 ms without CRT and 2.4 ms with CRT, at an operating frequency of 80 MHz and with a total of 333 K logic gates for 11 parallel processing units.
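
The data flow summarized above can be illustrated with a minimal Python sketch. It is not the Cox-Rower design: the base extension here uses exact CRT reconstruction through big integers, where hardware would use an approximate channel-wise method, and the toy moduli, the key (p = 61, q = 53, e = 17, d = 2753), and all function names are assumptions chosen for illustration. The sketch assumes the base product A is at least 4N, the second base product exceeds 2N, and both bases are coprime to N, so that intermediate values stay below 2N.

```python
from math import prod

def to_rns(x, base):
    """Residues of x with respect to each modulus in the base."""
    return [x % m for m in base]

def from_rns(res, base):
    """Exact CRT reconstruction (stand-in for hardware base extension)."""
    M = prod(base)
    total = 0
    for r, m in zip(res, base):
        Mi = M // m
        total += r * pow(Mi, -1, m) * Mi
    return total % M

def base_extend(res, src, dst):
    """Assumption: exact extension via big integers, for clarity only."""
    return to_rns(from_rns(res, src), dst)

def rns_montmul(xa, xb, ya, yb, n, base_a, base_b):
    """Return x*y*A^-1 mod n (value < 2n) as residues in both bases."""
    A = prod(base_a)
    # q = -x*y*n^-1 mod A, computed independently per channel of base A
    qa = [(-x * y * pow(n, -1, m)) % m for x, y, m in zip(xa, ya, base_a)]
    qb = base_extend(qa, base_a, base_b)
    # r = (x*y + q*n) / A, computed independently per channel of base B
    rb = [((x * y + q * n) * pow(A, -1, m)) % m
          for x, y, q, m in zip(xb, yb, qb, base_b)]
    ra = base_extend(rb, base_b, base_a)
    return ra, rb

def rns_modexp(msg, exp, n, base_a, base_b):
    """Left-to-right square-and-multiply built on rns_montmul."""
    A = prod(base_a)
    both = lambda v: (to_rns(v, base_a), to_rns(v, base_b))
    xa, xb = both(msg * A % n)   # Montgomery form of msg
    ra, rb = both(A % n)         # Montgomery form of 1
    for bit in bin(exp)[2:]:
        ra, rb = rns_montmul(ra, rb, ra, rb, n, base_a, base_b)
        if bit == "1":
            ra, rb = rns_montmul(ra, rb, xa, xb, n, base_a, base_b)
    one_a, one_b = both(1)
    ra, rb = rns_montmul(ra, rb, one_a, one_b, n, base_a, base_b)  # leave Montgomery form
    return from_rns(rb, base_b) % n

def rsa_crt_decrypt(c, p, q, d, base_a, base_b):
    """RSA decryption with CRT: two half-size exponentiations, then Garner recombination."""
    mp = rns_modexp(c % p, d % (p - 1), p, base_a, base_b)
    mq = rns_modexp(c % q, d % (q - 1), q, base_a, base_b)
    h = (pow(q, -1, p) * (mp - mq)) % p
    return mq + h * q

# Toy parameters (illustrative only, far too small for real use)
BASE_A = [13, 17, 19, 23]   # product 96577 >= 4 * 3233
BASE_B = [29, 31, 37, 41]   # product 1363783 > 2 * 3233
P, Q, E, D = 61, 53, 17, 2753
N = P * Q

cipher = rns_modexp(65, E, N, BASE_A, BASE_B)
plain = rns_modexp(cipher, D, N, BASE_A, BASE_B)
plain_crt = rsa_crt_decrypt(cipher, P, Q, D, BASE_A, BASE_B)
```

Each list comprehension in `rns_montmul` works on one residue channel at a time, which is the part that maps naturally onto parallel processing units. The CRT variant runs two exponentiations with half-size moduli and exponents, which is consistent with the shorter transaction time reported with CRT.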
