Asynchronous implementation of 1024-bit modular processor for RSA cryptosystem

This paper presents an implementation method for optimizing a 1024-bit RSA processor. The design is based on the Montgomery algorithm, modified to suit modular multiplication of large operands. We propose a new architecture for 1024-bit RSA processing that reduces the required hardware resources and is also well suited to an efficient I/O interface. A single-chip 1024-bit RSA processor based on the modified algorithm and architecture has been implemented in 0.65-μm SOG technology using Verilog HDL. The results show that the processor performs a 1024-bit RSA operation in less than 43 ms at 50 MHz.
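
For orientation, the following is a minimal software sketch of textbook Montgomery modular multiplication and the square-and-multiply exponentiation built on it. It is not the modified algorithm or the hardware architecture of this paper, which the abstract does not detail. The function names (`montgomery_setup`, `mont_mul`, `mont_pow`) and the choice R = 2^k with k = 1024 are assumptions made for illustration only, and the modular inverse via `pow(n, -1, r)` requires Python 3.8 or later.

```python
def montgomery_setup(n, k):
    """Precompute constants for an odd modulus n < 2**k, with R = 2**k."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime = -1 (mod R)
    r2 = (r * r) % n                # used to convert into Montgomery form
    return r, n_prime, r2


def mont_mul(a, b, n, k, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n (Montgomery reduction)."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # makes t + m*n divisible by R
    u = (t + m * n) >> k               # exact division by R is a shift
    return u - n if u >= n else u      # single conditional subtraction


def mont_pow(base, exp, n, k=1024):
    """Compute base**exp mod n by left-to-right square-and-multiply."""
    r, n_prime, r2 = montgomery_setup(n, k)
    x = mont_mul(base % n, r2, n, k, n_prime)  # base in Montgomery form
    acc = r % n                                # 1 in Montgomery form
    for bit in bin(exp)[2:]:
        acc = mont_mul(acc, acc, n, k, n_prime)
        if bit == "1":
            acc = mont_mul(acc, x, n, k, n_prime)
    return mont_mul(acc, 1, n, k, n_prime)     # convert back out
```

The point of the technique is that reduction modulo n is replaced by masking and shifting by a power of two, which is why it lends itself to hardware for large operands. A result can be checked against Python's built-in `pow(base, exp, n)` for any odd modulus n below 2^k.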