REED: Chiplet-Based Scalable Hardware Accelerator for Fully Homomorphic Encryption

Fully Homomorphic Encryption (FHE) has emerged as a promising technology for processing encrypted data without the need for decryption. Despite its potential, its practical implementation has faced challenges due to substantial computational overhead. To address this issue, we propose the $first$ chiplet-based FHE accelerator design `REED', which enables scalability and offers high throughput, thereby enhancing homomorphic encryption deployment in real-world scenarios. The design accounts for well-known wafer yield issues during fabrication, which significantly impact production costs. In contrast to state-of-the-art approaches, we also address data exchange overhead by proposing a non-blocking inter-chiplet communication strategy. We incorporate novel pipelined Number Theoretic Transform and automorphism techniques that leverage parallelism to provide high throughput. Experimental results demonstrate that the REED 2.5D integrated circuit occupies 177 mm$^2$ of chip area and consumes 82.5 W average power in 7nm technology. It achieves a speedup of up to 5,982$\times$ compared to a CPU (24-core 2$\times$Intel X5690), and 2$\times$ better energy efficiency and 50\% lower development cost than the state-of-the-art ASIC accelerator. To evaluate its practical impact, we are the $first$ to benchmark encrypted deep neural network training. Overall, this work enhances the practicality and deployability of fully homomorphic encryption in real-world scenarios.
