Memristors for neural branch prediction: a case study in strict latency and write endurance challenges

Memristors offer many potential advantages over more traditional memory-cell technologies, including extreme density and fast read times. Current devices, however, are plagued by problems of yield and durability. We present a limit study of an aggressive neural network application that has a high update rate and a strict latency requirement: an analog neural branch predictor. Traditional analog neural network (ANN) implementations of branch predictors are not designed on the assumption that the underlying bits are likely to fail from manufacturing defects and wear-out. Without careful precautions, a direct one-to-one replacement with memristors will behave poorly. We propose a hybrid system that uses an SRAM front-end cache and a distributed-sum scheme to overcome memristors' limitations. Our design can leverage devices with even modest durability (surviving only hours of continuous switching) to provide a system lasting 5 or more years of continuous operation. These schemes also make the design fault-tolerant. We find that, while a neural predictor benefits from higher density, current technology parameters do not allow a highly dense, energy-efficient design. We therefore discuss a range of plausible memristor characteristics that would, as the technology advances, make them practical for our application.
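To give a sense of scale for the durability claim, the following is a back-of-the-envelope sketch, not the paper's model. It assumes a device's endurance can be stated as hours of continuous switching at the rate the predictor would otherwise impose on it, and it asks by what factor the switching seen by each device must fall to reach a 5-year lifetime. The function name and the sample endurance values are hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def required_write_reduction(device_hours: float, target_years: float = 5.0) -> float:
    """Factor by which per-device switching must drop to reach the target lifetime.

    Assumption: wear is proportional to the number of switching events, so a
    device that lasts `device_hours` under continuous switching lasts
    proportionally longer when it switches proportionally less often.
    """
    if device_hours <= 0:
        raise ValueError("device_hours must be positive")
    return target_years * HOURS_PER_YEAR / device_hours


# Illustrative endurance values (hypothetical, not taken from the paper).
for hours in (1, 10, 100):
    factor = required_write_reduction(hours)
    print(f"{hours:>4} h endurance -> {factor:,.0f}x less switching per device")
```

Under these assumptions, a device that survives only a few hours of continuous switching needs its write traffic cut by roughly four orders of magnitude, whether by filtering updates before they reach the memristors or by spreading them over more devices.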
