FPGA-based Accelerators of Fully Pipelined Modular Multipliers for Homomorphic Encryption

Homomorphic encryption (HE) is an important cryptographic primitive that allows privacy-preserving computation. Current HE schemes are all based on modular arithmetic. Modular multiplication (ModMult) is one of the most frequently used modular operations, but in practice it is often prohibitively slow because its reduction step has high computational complexity. To address this, we present a set of novel FPGA-based accelerators for fully pipelined ModMults. To obtain a high-throughput integer multiplier (IntMult) inside the ModMult designs, we optimize the IntMult designs to exploit the digital signal processing (DSP) slices on FPGAs efficiently. For our target HE scheme, the full RNS-HEAAN scheme, the proposed Barrett ModMult design is optimized using specific moduli and extended to the Shoup ModMult algorithm. Implemented on a Xilinx Virtex UltraScale FPGA, the proposed Barrett and Shoup ModMult designs show, on average, a 2× shorter delay, 14× higher throughput at the same frequency, and 3× higher throughput/DSP than the previous non-fully pipelined Barrett ModMult design. In particular, our Barrett ModMult design with the specific moduli achieves the highest throughput/DSP value, even though it does not use the precomputation that the Shoup ModMult design requires. Compared with a reference software implementation, our ModMult designs are 679× faster on average when multiple ModMult cores are deployed to fully use the DSP slices on our target FPGA.
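For readers unfamiliar with the two reduction algorithms named above, the following is a minimal software sketch of textbook Barrett and Shoup modular multiplication. It is not the hardware design described in this paper, and it does not include the optimization for specific moduli. The function names, the word size n = 64, and the example modulus are assumptions chosen for illustration only.

```python
# Textbook Barrett and Shoup modular multiplication (illustrative sketch only).
# All names and parameters here are hypothetical, not taken from the paper.

def barrett_precompute(q):
    """Return (k, mu) with k = bit length of q and mu = floor(4^k / q)."""
    k = q.bit_length()
    mu = (1 << (2 * k)) // q
    return k, mu

def barrett_modmult(a, b, q, k, mu):
    """Compute a*b mod q for 0 <= a, b < q without a division."""
    x = a * b                  # full integer product (the IntMult step)
    t = (x * mu) >> (2 * k)    # quotient estimate, at most 1 below floor(x / q)
    r = x - t * q              # therefore 0 <= r < 2q
    if r >= q:                 # a single conditional subtraction suffices
        r -= q
    return r

def shoup_precompute(w, q, n):
    """For a fixed multiplicand w < q, return floor(w * 2^n / q)."""
    return (w << n) // q

def shoup_modmult(a, w, w_shoup, q, n):
    """Compute a*w mod q for 0 <= a < 2^n using the precomputed w_shoup."""
    t = (a * w_shoup) >> n     # quotient estimate, at most 1 below floor(a*w / q)
    r = a * w - t * q          # therefore 0 <= r < 2q
    if r >= q:
        r -= q
    return r

if __name__ == "__main__":
    q = (1 << 64) - (1 << 32) + 1      # example modulus (assumption)
    k, mu = barrett_precompute(q)
    a, w = 123456789123456789, 987654321987654321
    w_shoup = shoup_precompute(w, q, 64)
    # Both routines should agree with Python's built-in modular reduction.
    assert barrett_modmult(a, w, q, k, mu) == (a * w) % q
    assert shoup_modmult(a, w, w_shoup, q, 64) == (a * w) % q
```

The sketch makes the trade-off mentioned in the abstract visible: Barrett needs one precomputed constant per modulus, whereas Shoup needs one precomputed value per fixed multiplicand w, which pays off only when w is reused across many multiplications.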
