X-MANN: A Crossbar based Architecture for Memory Augmented Neural Networks

Memory Augmented Neural Networks (MANNs) augment deep neural networks with an external differentiable memory, enabling them to perform complex tasks well beyond the capabilities of conventional deep neural networks. We identify a unique challenge that arises in MANNs due to soft reads and writes to the differentiable memory, each of which requires access to all the memory locations. This characteristic of MANN workloads severely limits their performance on CPUs, GPUs, and classical neural network accelerators. We present the first effort to design a hardware architecture that improves the efficiency of MANNs. Leveraging the intrinsic ability of resistive crossbars to efficiently realize in-memory computations, we propose X-MANN, a memory-centric crossbar-based architecture that is specialized to match the compute characteristics observed in MANNs. We design a transposable crossbar processing unit that can efficiently perform the different computational kernels of MANNs. To improve the performance of soft writes in X-MANN, we propose an incremental write mechanism that leverages the characteristics of soft write operations. We develop an architectural simulator for X-MANN that utilizes array-level timing and power models of resistive crossbars calibrated from SPICE simulations. Across a suite of MANN benchmarks, X-MANN achieves 23.7x-45.7x speedup and 75.1x-267.1x reduction in energy over state-of-the-art GPU implementations.
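To make the soft read/write challenge concrete, below is a minimal NumPy sketch of the attention-weighted read and erase/add write used in NTM/DNC-style MANNs. This is an illustrative sketch under assumed shapes and names, not the paper's implementation; the point it demonstrates is that every one of the N memory rows participates in both operations.

```python
import numpy as np

def soft_read(M, w):
    # M: (N, D) memory matrix; w: (N,) attention weights summing to 1.
    # The read vector is a weighted sum over ALL N memory rows.
    return w @ M  # shape (D,)

def soft_write(M, w, erase, add):
    # NTM-style soft write: every row i is scaled by (1 - w[i]*erase)
    # and then incremented by w[i]*add, so all N rows are updated.
    M = M * (1.0 - np.outer(w, erase))
    M = M + np.outer(w, add)
    return M

# Usage: a 128-slot memory of 64-dimensional vectors (sizes are arbitrary).
rng = np.random.default_rng(0)
M = rng.standard_normal((128, 64))
w = np.full(128, 1.0 / 128)            # uniform attention, for illustration
r = soft_read(M, w)
M = soft_write(M, w, erase=rng.random(64), add=rng.random(64))
```

Because the read is a vector-matrix product and the write is a rank-1 update of M, both cost O(N*D) memory traffic per timestep on a CPU or GPU. A resistive crossbar, by contrast, evaluates the analogous matrix-vector product in place in a single analog step via Kirchhoff's current law (each column current sums the products of row voltages and cell conductances), which is the property X-MANN exploits.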
