Improving Dependability of Neuromorphic Computing With Non-Volatile Memory

As process technology continues to scale aggressively, circuit aging in neuromorphic hardware due to negative bias temperature instability (NBTI) and time-dependent dielectric breakdown (TDDB) is becoming a critical reliability issue, and it is expected to worsen when non-volatile memory (NVM) is used for synaptic storage. This is because NVM devices require high voltages and currents to access their synaptic weights, which further accelerate circuit aging in the hardware. Current methods for qualifying reliability are overly conservative: they estimate circuit aging under worst-case operating conditions and therefore constrain performance unnecessarily. This paper proposes RENEU, a reliability-oriented approach to mapping machine learning applications to neuromorphic hardware, with the aim of improving system-wide reliability without compromising key performance metrics such as the execution time of these applications on the hardware. Fundamental to RENEU is a novel formulation of the aging of CMOS-based circuits in neuromorphic hardware that considers different failure mechanisms. Using this formulation, RENEU develops a system-wide reliability model that can be used inside a design-space exploration framework for mapping neurons and synapses to the hardware. To this end, RENEU uses an instance of Particle Swarm Optimization (PSO) to generate mappings that are Pareto-optimal in terms of performance and reliability. We evaluate RENEU using different machine learning applications on state-of-the-art neuromorphic hardware with NVM synapses. Our results demonstrate an average 38% reduction in circuit aging, leading to an average 18% improvement in the lifetime of the hardware compared to current practices. RENEU introduces only a marginal performance overhead of 5% compared to a state-of-the-art performance-oriented approach.
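To illustrate what "Pareto-optimal in terms of performance and reliability" means for a set of candidate mappings, the following is a minimal sketch and not RENEU's implementation. It assumes each candidate mapping has already been scored with two numbers to minimize, an execution time and an aging estimate. The names `dominates`, `pareto_front`, and `candidates`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (label, execution_time, aging): hypothetical scores, lower is better for both.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only; these are not results from the paper.
candidates: List[Candidate] = [
    ("map_a", 1.00, 0.90),  # fastest, most aging
    ("map_b", 1.05, 0.60),
    ("map_c", 1.20, 0.65),  # slower and more aging than map_b
    ("map_d", 1.30, 0.55),  # slowest, least aging
]

front = pareto_front(candidates)
print([label for label, _, _ in front])
```

In this sketch `map_c` is discarded because `map_b` is better on both objectives, while the remaining three mappings each represent a different trade-off between execution time and aging. A search procedure such as PSO would be responsible for generating the candidates; this filter only selects the non-dominated ones.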
