NTTU: An Area-Efficient Low-Power NTT-Uncoupled Architecture for NTT-Based Multiplication

Large-integer multiplication, or equivalently large-degree polynomial multiplication, is the most time-consuming operation in fully homomorphic encryption (FHE). For a multiplier of this size, it is difficult to keep area and power consumption low while achieving high performance. To address this issue, this article proposes NTTU, an area-efficient, low-power architecture for multiplication. First, a combined number theoretic transform (NTT) method is proposed that pairs decimation-in-time (DIT) NTTs for natural-order and bit-reversed-order inputs. It eliminates the zero-padding step, the scramble step, and the first NTT stage, saving $7N/2$ clock cycles compared with the single-type NTT method. Second, an NTT-uncoupled architecture is proposed to uncouple the multiplication components, halving the storage space for coefficients compared with state-of-the-art designs. Third, a parallel computing architecture based on a crossed memory access scheme is proposed, halving the corresponding execution time compared with serial execution. Synthesized in 65 nm technology, the proposed architecture can multiply two 1024k/768k-bit integers in 1.7 ms at 500 MHz at a cost of 13.66/7.67 million gates and 726.7/550.2 mW, achieving a 71.17%/30.37% reduction in area-time product (ATP) compared with state-of-the-art ASIC designs.
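For orientation, the conventional single-type NTT multiplication flow that the abstract uses as its baseline can be sketched in software as follows. This is a minimal textbook illustration, not the NTTU architecture: it keeps the zero-padding and scramble (bit-reversal) steps that the combined method is said to eliminate. The prime `P`, the primitive root `G`, and all function names are assumptions chosen for the example and do not come from the paper.

```python
# Textbook NTT-based polynomial multiplication (baseline flow, not NTTU).
# P and G are illustrative assumptions: P = 119 * 2^23 + 1 is a common
# NTT-friendly prime and G = 3 is a primitive root modulo P.
P = 998244353
G = 3


def bit_reverse_permute(a):
    """Scramble step: reorder a in place into bit-reversed index order."""
    n = len(a)
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]


def ntt(a, invert=False):
    """In-place radix-2 DIT NTT; len(a) must be a power of two."""
    n = len(a)
    bit_reverse_permute(a)
    length = 2
    while length <= n:
        # Principal root of unity of order `length` (inverted for the INTT).
        w_len = pow(G, (P - 1) // length, P)
        if invert:
            w_len = pow(w_len, P - 2, P)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                # Butterfly: combine one pair of coefficients.
                u = a[k]
                v = a[k + half] * w % P
                a[k] = (u + v) % P
                a[k + half] = (u - v) % P
                w = w * w_len % P
        length <<= 1
    if invert:
        n_inv = pow(n, P - 2, P)
        for i in range(n):
            a[i] = a[i] * n_inv % P


def multiply(x, y):
    """Multiply two polynomials given as coefficient lists, low order first."""
    size = 1
    while size < len(x) + len(y) - 1:
        size <<= 1
    # Zero-padding step: extend both operands to the transform length.
    fa = x + [0] * (size - len(x))
    fb = y + [0] * (size - len(y))
    ntt(fa)
    ntt(fb)
    fc = [p * q % P for p, q in zip(fa, fb)]  # pointwise multiplication
    ntt(fc, invert=True)
    return fc[:len(x) + len(y) - 1]
```

For example, `multiply([1, 2, 3], [4, 5])` computes the product (1 + 2x + 3x²)(4 + 5x). Results are exact only while every product coefficient stays below `P`; a large-integer multiplier would additionally split its operands into small limbs and propagate carries after the inverse transform.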
