Design and optimization of FeFET-based crossbars for binary convolution neural networks

Binary convolution neural networks (CNNs) have attracted much attention for embedded applications because of their low hardware cost and acceptable accuracy. Nonvolatile resistive random-access memories (RRAMs) have been adopted to build crossbar accelerators for binary CNNs. However, RRAMs still face fundamental challenges such as sneak paths and high write energy. We exploit another emerging nonvolatile device, the ferroelectric field-effect transistor (FeFET), to build crossbars that improve the energy efficiency of binary CNNs. Because of its three-terminal transistor structure, an FeFET can function as both a nonvolatile storage element and a controllable switch, so that both write and read power can be reduced. Simulation results demonstrate that, compared with two RRAM-based crossbar structures, our FeFET-based design reduces write power by 5600× and 3950×, and read power by 4.1× and 3.1×, respectively. We also tackle an important challenge in crossbar-based CNN accelerators: when a crossbar array is not large enough to hold the weights of one convolution layer, how should the workload be partitioned and the computations mapped to the crossbar array? We introduce a hardware-software co-optimization solution to this problem that applies to any crossbar accelerator.
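To make the partitioning question concrete, the following is a minimal sketch of one generic way to split a layer across crossbars that are too small to hold it. It is not the co-optimization method of this work, which the abstract does not detail. It assumes the convolution layer has been unrolled into a weight matrix with one row per input element and one column per output channel, that weights and inputs are binarized to ±1, and that each crossbar holds at most `rows` x `cols` weights. The function name `tile_matvec` and all sizes are hypothetical, chosen only for illustration.

```python
import math
import numpy as np

def tile_matvec(weights, x, rows, cols):
    """Compute x @ weights using tiles of at most rows x cols weights.

    Each tile stands in for one crossbar (an assumption of this sketch).
    Tiles that share output columns produce partial sums that are
    accumulated outside the crossbar.
    """
    k, n = weights.shape
    out = np.zeros(n, dtype=np.int64)
    n_tiles = 0
    for r0 in range(0, k, rows):          # split along the input dimension
        for c0 in range(0, n, cols):      # split along the output channels
            tile = weights[r0:r0 + rows, c0:c0 + cols]
            out[c0:c0 + cols] += x[r0:r0 + rows] @ tile
            n_tiles += 1
    return out, n_tiles

# Hypothetical layer: 16 input channels, 3x3 kernels, 40 output channels.
rng = np.random.default_rng(0)
c_in, ksize, c_out = 16, 3, 40
K = c_in * ksize * ksize                  # 144 rows after unrolling
weights = rng.choice([-1, 1], size=(K, c_out))
x = rng.choice([-1, 1], size=K)           # one unrolled input patch

result, n_tiles = tile_matvec(weights, x, rows=64, cols=32)

# The tiled result matches the unpartitioned product.
assert np.array_equal(result, x @ weights)
# Tile count is ceil(K / rows) * ceil(c_out / cols).
assert n_tiles == math.ceil(K / 64) * math.ceil(c_out / 32)
```

In this sketch, splitting along the input dimension creates partial sums that must be added across tiles, while splitting along the output channels does not. A mapping strategy would have to weigh that accumulation cost against crossbar utilization.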
