Parallel Stateful Logic in RRAM: Theoretical Analysis and Arithmetic Design

Processing-in-memory (PIM) provides massive parallelism with high energy efficiency and has become a promising solution to the memory wall problem. Recently, emerging metal-oxide resistive random access memory (RRAM) has shown potential as a basis for PIM architectures. Several stateful logic operations, e.g., NOR and NAND, can be executed in parallel in an RRAM crossbar. Although previous works have designed some algorithms using stateful logic, how to fully exploit its potential parallelism and design an asymptotically fast algorithm for a given function remains an open question. In this work, we theoretically analyze the parallelism in an RRAM crossbar and design several asymptotically optimal arithmetic algorithms. In detail, we first propose the Single Instruction Multiple Lines (SIML) model to unify the stateful logic families and prove three lower bounds on the time complexity of a parallel RRAM algorithm. Then, guided by the lower bound analysis, we design three algorithms for integer addition functions with stateful logic, all of which reach the time complexity lower bound. Finally, we make two extensions of the integer addition algorithms: supporting multiplication functions by decomposing them into additions, and supporting the flex-point data type by proposing an exponent and mantissa update flow. Experimental evaluation shows that our integer algorithms achieve a speedup of up to 13.79x over previous RRAM algorithms. Our flex-point implementation achieves a 26.60x speedup and saves 73.68% energy compared to an ARM processor.
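To make the notion of line-parallel stateful logic concrete, the following Python sketch simulates a crossbar in which one NOR instruction is applied to the same columns of every row in a single step. It is an illustration only, not one of the three addition algorithms proposed in this work: it uses a textbook nine-NOR full adder in ripple-carry form, and the names `nor_cols` and `add_rows`, the column layout, and the cycle counting are all assumptions made for the example. Device-level details such as output-cell initialization are omitted.

```python
def nor_cols(xbar, a, b, out):
    """One simulated cycle: column `out` <- NOR(column a, column b) in every row."""
    for row in xbar:
        row[out] = 1 - (row[a] | row[b])


def add_rows(pairs, n_bits):
    """Add each (x, y) pair in its own row; return (sums, cycles).

    Hypothetical column layout per row:
    A bits | B bits | sum bits | two carry columns | seven scratch columns.
    """
    A, B, S = 0, n_bits, 2 * n_bits
    c_in, c_out = 3 * n_bits, 3 * n_bits + 1
    T = 3 * n_bits + 2
    xbar = [[0] * (T + 7) for _ in pairs]
    for row, (x, y) in zip(xbar, pairs):
        for i in range(n_bits):
            row[A + i] = (x >> i) & 1
            row[B + i] = (y >> i) & 1

    n1, n2, n3, n4, n5, n6, n7 = range(T, T + 7)
    cycles = 0
    for i in range(n_bits):
        a, b = A + i, B + i
        # Nine-NOR full adder: (input, input, output) column triples.
        gates = [
            (a, b, n1), (a, n1, n2), (b, n1, n3),
            (n2, n3, n4),                # n4 = XNOR(a, b)
            (n4, c_in, n5), (n4, n5, n6), (c_in, n5, n7),
            (n6, n7, S + i),             # sum bit = a XOR b XOR carry
            (n1, n5, c_out),             # carry out
        ]
        for gate in gates:
            nor_cols(xbar, *gate)
            cycles += 1
        c_in, c_out = c_out, c_in        # swap carry columns instead of copying

    sums = []
    for row in xbar:
        value = sum(row[S + i] << i for i in range(n_bits))
        sums.append(value | (row[c_in] << n_bits))
    return sums, cycles


results, cycles = add_rows([(3, 5), (255, 1), (0, 0), (200, 100)], 8)
```

In this sketch the cycle count depends only on the bit width (nine NOR steps per bit), not on the number of rows, which is the kind of row-level parallelism the abstract refers to. The ripple-carry structure is chosen for brevity and makes no claim to meet the lower bounds proved in the paper.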
