Accelerating subset sum and lattice based public-key cryptosystems with multi-core CPUs and GPUs

Abstract Post-quantum cryptosystems based on subset sum and lattice problems have gained much attention from researchers due to their simple construction, their resistance to quantum attacks, the new potential applications they provide, and above all, the mathematical security proofs that rigorously relate them to computational hard problems. However, the computational complexity of these cryptosystems is still high compared to classic number-theoretical ones, which may impede their adoption on a large scale. We studied the performance of three public-key cryptosystems based on subset sum, learning with errors and ring learning with errors problems. We provide a systematic study for choosing their parameters to guarantee sufficient security levels and detail an asymptotic comparison between them in terms of storage and running time complexities. We accelerate the running time of these cryptosystems by exploiting the inherent parallelism in computations through a GPGPU-based parallel implementation. The cryptosystems are implemented using C++ on Intel(R) Xeon(R) multi-core 64-bit processors machine with CUDA-enabled Tesla K80 GPUs. The parallel implementation is based on OpenCL framework and can run on arbitrary hardware platform accelerators with minor changes. Several optimizations and efficient algorithms were used to compute the core operations in each cryptosystem to achieve optimum performance. The ring learning with errors based cryptosystem showed the best performance while the Subset Sum cryptosystem showed the highest speedup gain for the encryption primitive.

[1]  Victor Shoup,et al.  A computational introduction to number theory and algebra , 2005 .

[2]  Franz Winkler,et al.  Polynomial Algorithms in Computer Algebra , 1996, Texts and Monographs in Symbolic Computation.

[3]  Tanja Lange,et al.  Post-quantum cryptography , 2008, Nature.

[4]  Claus-Peter Schnorr,et al.  Lattice basis reduction: Improved practical algorithms and solving subset sum problems , 1991, FCT.

[5]  Todd A. Brun,et al.  Quantum Computing , 2011, Computer Science, The Hardware, Software and Heart of It.

[6]  Oded Regev,et al.  On lattices, learning with errors, random linear codes, and cryptography , 2009, JACM.

[7]  Chris Peikert,et al.  On Ideal Lattices and Learning with Errors over Rings , 2010, JACM.

[8]  Martin E. Hellman,et al.  Hiding information and signatures in trapdoor knapsacks , 1978, IEEE Trans. Inf. Theory.

[9]  Ariel J. Feldman,et al.  Lest we remember: cold-boot attacks on encryption keys , 2008, CACM.

[10]  Diego Fasoli,et al.  Three Applications of GPU Computing in Neuroscience , 2012, Computing in Science & Engineering.

[11]  Miklós Ajtai,et al.  Generating Hard Instances of Lattice Problems , 1996, Electron. Colloquium Comput. Complex..

[12]  Antoine Joux,et al.  Improved low-density subset sum algorithms , 1992, computational complexity.

[13]  Miklós Ajtai,et al.  Generating hard instances of lattice problems (extended abstract) , 1996, STOC '96.

[14]  Ronald L. Rivest,et al.  A knapsack-type public key cryptosystem based on arithmetic in finite fields , 1988, IEEE Trans. Inf. Theory.

[15]  Berk Sunar,et al.  Accelerating fully homomorphic encryption using GPU , 2012, 2012 IEEE Conference on High Performance Extreme Computing.

[16]  Martin R. Albrecht,et al.  On the concrete hardness of Learning with Errors , 2015, J. Math. Cryptol..

[17]  Daniele Micciancio,et al.  On Bounded Distance Decoding, Unique Shortest Vectors, and the Minimum Distance Problem , 2009, CRYPTO.

[18]  Mingjie Liu,et al.  Solving BDD by Enumeration: An Update , 2013, CT-RSA.

[19]  Daniel J. Bernstein,et al.  Introduction to post-quantum cryptography , 2009 .

[20]  Serge Vaudenay,et al.  Cryptanalysis of the Chor-Rivest Cryptosystem , 1998, CRYPTO.

[21]  Yasuyuki Murakami,et al.  A Note on Security of Public-Key Cryptosystem Provably as Secure as Subset Sum Problem , 2014, IEICE Trans. Fundam. Electron. Commun. Comput. Sci..

[22]  Tancrède Lepoint,et al.  NFLlib: NTT-Based Fast Lattice Library , 2016, CT-RSA.

[23]  Gil Segev,et al.  Public-Key Cryptographic Primitives Provably as Secure as Subset Sum , 2010, TCC.

[24]  Kim Laine,et al.  Key Recovery for LWE in Polynomial Time , 2015, IACR Cryptol. ePrint Arch..

[25]  Craig Gentry,et al.  Fully homomorphic encryption using ideal lattices , 2009, STOC '09.

[26]  C. Pomerance,et al.  Prime Numbers: A Computational Perspective , 2002 .

[27]  László Lovász,et al.  Factoring polynomials with rational coefficients , 1982 .

[28]  Ashish Jain,et al.  Cryptanalytic Results on Knapsack Cryptosystem Using Binary Particle Swarm Optimization , 2014, SOCO-CISIS-ICEUTE.

[29]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[30]  Manish Vachharajani,et al.  GPU acceleration of numerical weather prediction , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.

[31]  Chris Peikert,et al.  A Toolkit for Ring-LWE Cryptography , 2013, IACR Cryptol. ePrint Arch..

[32]  William E. Burr,et al.  Recommendation for Key Management, Part 1: General (Revision 3) , 2006 .

[33]  Vinod Vaikuntanathan,et al.  Simultaneous Hardcore Bits and Cryptography against Memory Attacks , 2009, TCC.

[34]  Karl Rupp,et al.  GPU-Accelerated Non-negative Matrix Factorization for Text Mining , 2012, NLDB.

[35]  Peter W. Shor,et al.  Algorithms for quantum computation: discrete logarithms and factoring , 1994, Proceedings 35th Annual Symposium on Foundations of Computer Science.

[36]  Adi Shamir,et al.  A polynomial time algorithm for breaking the basic Merkle-Hellman cryptosystem , 1982, 23rd Annual Symposium on Foundations of Computer Science (sfcs 1982).

[37]  Frederik Vercauteren,et al.  Efficient software implementation of ring-LWE encryption , 2015, 2015 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[38]  Zhe Liu,et al.  Efficient Ring-LWE Encryption on 8-Bit AVR Processors , 2015, CHES.

[39]  Tim Güneysu,et al.  Towards Efficient Arithmetic for Lattice-Based Cryptography on Reconfigurable Hardware , 2012, LATINCRYPT.

[40]  Chaohui Du,et al.  High-speed polynomial multiplier architecture for ring-LWE based public key cryptosystems , 2016, 2016 International Great Lakes Symposium on VLSI (GLSVLSI).

[41]  Gilles Brassard,et al.  Strengths and Weaknesses of Quantum Computing , 1997, SIAM J. Comput..