Energy-Efficient High-Throughput Montgomery Modular Multipliers for RSA Cryptosystems

Modular exponentiation in the Rivest-Shamir-Adleman (RSA) cryptosystem is usually carried out by repeated modular multiplications on large integers. To speed up encryption and decryption, many high-speed Montgomery modular multiplication algorithms and hardware architectures use carry-save addition, which avoids carry propagation at each addition of the add-shift loop. In this paper, we propose an energy-efficient algorithm and a corresponding architecture that both reduce the energy consumption and further increase the throughput of Montgomery modular multipliers. The proposed architecture bypasses superfluous carry-save additions and register write operations, which lowers energy consumption and raises throughput. We also modify the barrel register full adder (BRFA) so that gated-clock design can be applied to significantly reduce the energy consumed by the storage elements in the BRFA. Experimental results show that the proposed approaches achieve up to 60% energy saving and a 24.6% throughput improvement for a 1024-bit Montgomery multiplier.
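The add-shift loop and the bypassing idea described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the algorithm or architecture proposed in the paper: it assumes the textbook radix-2 Montgomery loop with the running value kept in carry-save form (a sum word and a carry word), and it assumes a deliberately conservative bypass rule that skips the carry-save addition only when the selected operand is zero and both words are even. The names `csa`, `mont_mul_bypass`, and `skipped` are hypothetical.

```python
def csa(x, y, z):
    """3:2 carry-save adder: returns (s, c) with s + c == x + y + z."""
    return x ^ y ^ z, ((x & y) | (y & z) | (x & z)) << 1


def mont_mul_bypass(a, b, n, k):
    """Radix-2 Montgomery product a * b * 2^(-k) mod n, for odd n < 2^k.

    Illustrative model only (assumed bypass rule, not the paper's):
    returns (result, skipped), where `skipped` counts the iterations
    in which the carry-save addition was bypassed.
    """
    assert n & 1 and n < (1 << k) and 0 <= a < n and 0 <= b < n
    ss, sc = 0, 0            # running value is ss + sc (carry-save form)
    skipped = 0
    for i in range(k):
        a_i = (a >> i) & 1
        # q makes the running value plus the operand even
        q = (ss ^ sc ^ (a_i & b)) & 1
        # in hardware this would be a selection among 0, b, n and b + n
        operand = a_i * b + q * n
        if operand == 0 and (ss & 1) == 0 and (sc & 1) == 0:
            skipped += 1     # nothing to add and both words shift exactly
        else:
            ss, sc = csa(ss, sc, operand)
        ss >>= 1             # exact halving: both low bits are 0 here
        sc >>= 1
    s = ss + sc              # single carry-propagate addition at the end
    if s >= n:
        s -= n               # final correction, since s < 2n
    return s, skipped


if __name__ == "__main__":
    r, skipped = mont_mul_bypass(123, 45, 239, 8)
    # r * 2^8 is congruent to 123 * 45 modulo 239
    print(r, skipped, (r << 8) % 239 == (123 * 45) % 239)
```

In this model, carry propagation happens only once, in the final `ss + sc`. The `skipped` counter gives a rough feel for how often an iteration has no useful work to do, which is the kind of redundancy a bypassing architecture targets.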
