sRSA: High Speed RSA on the Intel MIC Architecture

RSA cryptography provides the key functions for signing and verifying digital signatures and for encrypting and decrypting shared secrets, and it is broadly deployed to secure end-to-end communications. However, adoption of RSA-enabled applications remains limited, mainly because of the large computational overhead, in particular that of cryptographic operations with the RSA private key. In this paper, we design and implement sRSA, a high-speed RSA implementation on the new Intel® Many Integrated Core (MIC) architecture. We introduce several optimization strategies in sRSA for the MIC architecture without jeopardizing the security level. For example: 1) sRSA explicitly and efficiently vectorizes the underlying cryptographic primitives with the 512-bit vector registers; 2) it integrates the advanced algorithmic features of fast RSA variants and many other fine-grained implementation techniques; 3) it is carefully designed to resist the applicable attacks on RSA systems, namely the factoring attacks and the CRT exponent attack. Finally, we evaluate the performance of sRSA and compare it with the industry-standard OpenSSL. The benchmark results show that sRSA achieves latency comparable to, and throughput much higher than, that of OpenSSL on both the CPU and the MIC-based Phi coprocessor.
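To make the cost of the private-key operation concrete, the following is a minimal sketch of the textbook CRT (Chinese Remainder Theorem) speedup that fast RSA variants commonly build on. It is not the sRSA implementation: the function name `crt_private_op` and the toy primes are assumptions chosen for illustration, there is no padding or vectorization, and the parameters are far too small to be secure. It assumes Python 3.8 or later for `pow(x, -1, m)`.

```python
def crt_private_op(c, p, q, d):
    """Compute c^d mod (p*q) using two half-size exponentiations (Garner's form)."""
    d_p = d % (p - 1)          # CRT exponent modulo p - 1
    d_q = d % (q - 1)          # CRT exponent modulo q - 1
    q_inv = pow(q, -1, p)      # q^{-1} mod p (Python 3.8+)

    m_p = pow(c % p, d_p, p)   # result modulo p
    m_q = pow(c % q, d_q, q)   # result modulo q

    # Recombine the two residues into the unique value modulo p*q.
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q


# Toy parameters, for illustration only (insecure key size).
p, q, e = 61, 53, 17
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

message = 65
ciphertext = pow(message, e, n)
recovered = crt_private_op(ciphertext, p, q, d)
```

The two exponentiations operate on half-size moduli with half-size exponents, which is why the CRT form is cheaper than a single `pow(c, d, n)`; in this sketch `recovered` should equal both `message` and `pow(ciphertext, d, n)`.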
