Reducing circuit design complexity for neuromorphic machine learning systems based on Non-Volatile Memory arrays

Machine Learning (ML) is an attractive application of Non-Volatile Memory (NVM) arrays [1,2]. However, achieving speedup over GPUs requires minimal sharing of neuron circuits and thus highly area-efficient peripheral circuitry, so that ML reads and writes can be massively parallel with minimal time-multiplexing [2]. This makes neuron hardware offering full ‘software-equivalent’ functionality impractical. We analyze the neuron circuit requirements for implementing back-propagation in NVM arrays and introduce approximations that reduce design complexity and area. We discuss the interplay between circuits and NVM devices, including the need for an occasional RESET step, the number of programming pulses to apply per weight update, and the stochastic nature of NVM conductance change. In all cases, we show that by leveraging the algorithm's inherent resilience to error, practical circuit approaches can still maintain competitive test accuracies on ML benchmarks.
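
To make the device-level issues above concrete, the following is a minimal Python sketch of weight updates on a differential (G+, G-) NVM conductance pair. All names and parameter values (PULSE_MEAN, PULSE_SIGMA, RESET_THRESHOLD, the pulse counts) are illustrative assumptions, not values from the paper: each SET pulse yields a small, stochastic, strictly unidirectional conductance increase, and an occasional RESET re-zeros the pair once both devices approach saturation.

    import numpy as np

    # Minimal sketch of stochastic, unidirectional NVM weight updates on a
    # differential conductance pair. All parameter values are hypothetical.
    G_MAX = 1.0            # normalized maximum conductance
    PULSE_MEAN = 0.01      # assumed mean conductance change per SET pulse
    PULSE_SIGMA = 0.003    # assumed pulse-to-pulse variability
    RESET_THRESHOLD = 0.9  # assumed saturation level that triggers a RESET

    rng = np.random.default_rng(0)

    def apply_update(g_plus, g_minus, delta_w, max_pulses=4):
        """Program an update onto a (G+, G-) pair; weight = g_plus - g_minus.

        Positive updates fire SET pulses on G+, negative updates on G-.
        Each pulse adds a stochastic, non-negative increment, mimicking the
        unidirectional partial-SET behavior of phase-change memory.
        """
        n_pulses = min(max_pulses, int(round(abs(delta_w) / PULSE_MEAN)))
        step = rng.normal(PULSE_MEAN, PULSE_SIGMA, size=n_pulses).clip(min=0.0).sum()
        if delta_w >= 0:
            g_plus = min(G_MAX, g_plus + step)
        else:
            g_minus = min(G_MAX, g_minus + step)
        return g_plus, g_minus

    def occasional_reset(g_plus, g_minus):
        """RESET both devices once they near saturation, then re-program
        only the net weight onto whichever device needs it."""
        if min(g_plus, g_minus) > RESET_THRESHOLD:
            net = g_plus - g_minus
            g_plus, g_minus = max(net, 0.0), max(-net, 0.0)
        return g_plus, g_minus

    # Example: drive a single synapse with random gradient-like updates.
    g_p, g_m = 0.0, 0.0
    for _ in range(2000):
        g_p, g_m = apply_update(g_p, g_m, rng.normal(0.0, 0.02))
        g_p, g_m = occasional_reset(g_p, g_m)
    print(f"G+ = {g_p:.3f}, G- = {g_m:.3f}, weight = {g_p - g_m:+.3f}")

Because each pulse can only increase conductance, negative updates must be programmed onto G-; the occasional RESET is what keeps both devices from pinning at G_MAX, which would otherwise freeze the effective weight.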

[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, 1998.

[2] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, "An empirical evaluation of deep architectures on problems with many factors of variation," in Proc. ICML, 2007.

[3] D. E. Rumelhart, G. E. Hinton, and J. L. McClelland, "A general framework for parallel distributed processing," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, 1986.

[4] L. Kull et al., "A 3.1 mW 8b 1.2 GS/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32 nm digital SOI CMOS," in IEEE ISSCC Dig. Tech. Papers, 2013.

[5] G. W. Burr et al., "Large-scale neural networks implemented with non-volatile memory as the synaptic weight element: Comparative performance analysis (accuracy, speed, and power)," in IEDM Tech. Dig., 2015.

[6] L. Kull et al., "A 3.1 mW 8b 1.2 GS/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32 nm digital SOI CMOS," IEEE J. Solid-State Circuits, 2013.

[7] G. W. Burr et al., "Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element," in IEDM Tech. Dig., 2014.

[8] G. W. Burr et al., "Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses) using phase-change memory as the synaptic weight element," IEEE Trans. Electron Devices, 2015.