Efficient Polynomial Multiplication via Modified Discrete Galois Transform and Negacyclic Convolution

Univariate polynomial multiplication in \(\mathbb {Z}_q[x]/\langle x^n+1 \rangle \) has brought great attention recently. Thanks to new construction of cryptographic solutions based on lattice and ring-learning with errors problems. A number of software libraries, such as NTL and FLINT, implements fast multiplication algorithms to perform this operation efficiently. The basic notion behind fast polynomial multiplication algorithms is based on the relation between multiplication and convolution which can be computed efficiently via fast Fourier transform (FFT) algorithms. Hence, efficient FFT is crucial to improve fast multiplication performance. An interesting algorithm that cuts FFT length in half is based on the discrete Gaussian transform (DGT). DGT was first proposed to work only with primes that support Gaussian integers arithmetic known as Gaussian primes. We modify this algorithm to work with not necessarily Gaussian primes and show how its parameters can be found efficiently. We introduce an array of optimization techniques to enhance the performance on commodity 64-bit machines. The proposed algorithm is implemented in C++ and compared with mature and highly optimized number theory libraries, namely, NTL and FLINT. The experiments show that our algorithm performs faster than both libraries and achieves speedup factors ranging from 1.01x–1.2x and 1.18x–1.55x compared to NTL and FLINT, respectively.

[1]  J. Tukey,et al.  An algorithm for the machine calculation of complex Fourier series , 1965 .

[2]  W. M. Gentleman,et al.  Fast Fourier Transforms: for fun and profit , 1966, AFIPS '66 (Fall).

[3]  Reiner Creutzburg,et al.  Parameter determination for complex number-theoretic transforms using cyclotomic polynomials , 1989 .

[4]  Franz Winkler,et al.  Polynomial Algorithms in Computer Algebra , 1996, Texts and Monographs in Symbolic Computation.

[5]  Joachim von zur Gathen,et al.  Modern Computer Algebra , 1998 .

[6]  R. Crandall Integer convolution via split-radix fast Galois transform , 1999 .

[7]  M. A. Hernández Chebyshev's Approximation Algorithms and Applications , 2001 .

[8]  R. Blahut Algebraic Codes for Data Transmission , 2002 .

[9]  Vincent Rijmen,et al.  The Design of Rijndael: AES - The Advanced Encryption Standard , 2002 .

[10]  Arnold Schönhage,et al.  Schnelle Multiplikation großer Zahlen , 1971, Computing.

[11]  Chris Peikert,et al.  SWIFFT: A Modest Proposal for FFT Hashing , 2008, FSE.

[12]  Charles C. Weems,et al.  High Precision Integer Multiplication with a GPU Using Strassen's Algorithm with Multiple FFT Sizes , 2011, Parallel Process. Lett..

[13]  Craig Gentry,et al.  (Leveled) fully homomorphic encryption without bootstrapping , 2012, ITCS '12.

[14]  Frederik Vercauteren,et al.  Somewhat Practical Fully Homomorphic Encryption , 2012, IACR Cryptol. ePrint Arch..

[15]  Chris Peikert,et al.  A Toolkit for Ring-LWE Cryptography , 2013, IACR Cryptol. ePrint Arch..

[16]  Michael Naehrig,et al.  Improved Security for a Ring-Based Fully Homomorphic Encryption Scheme , 2013, IMACC.

[17]  William B. Hart,et al.  FLINT : Fast library for number theory , 2013 .

[18]  Chris Peikert,et al.  On Ideal Lattices and Learning with Errors over Rings , 2010, JACM.

[19]  Sedat Akleylek,et al.  On the Efficiency of Polynomial Multiplication for Lattice-Based Cryptography on GPUs Using CUDA , 2015, BalkanCryptSec.

[20]  Frederik Vercauteren,et al.  High-Speed Polynomial Multiplication Architecture for Ring-LWE and SHE Cryptosystems , 2015, IEEE Transactions on Circuits and Systems I: Regular Papers.

[21]  Berk Sunar,et al.  cuHE: A Homomorphic Encryption Accelerator Library , 2015, IACR Cryptol. ePrint Arch..