SBAC: A statistics based cache bypassing method for asymmetric-access caches

Asymmetric-access caches built on emerging memory technologies, such as STT-RAM and RRAM, have recently become competitive designs. Because write operations in these caches consume more time and energy than reads, data should bypass an asymmetric-access cache unless its locality justifies the cost of allocation. Prior bypassing approaches, however, do not address the asymmetric-access property well: they are not energy efficient and they incur non-trivial operation overhead. To overcome these problems, we propose SBAC, a cache bypassing method based on data locality statistics of the whole cache rather than on a single cache line's signature. We observe that SBAC's decision-making is highly accurate and that its optimization technique works efficiently when multiple applications run concurrently. Experiments show that SBAC reduces overall energy consumption by 22.3% and execution time by 8.3%. Compared to prior approaches, the design overhead of SBAC is trivial.
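The abstract does not spell out the decision rule, so the following is only an illustrative sketch of what a bypass decision driven by cache-wide statistics could look like; it is not SBAC's actual algorithm. It assumes a simple break-even energy model: allocating a line costs one expensive write, and each later hit saves the difference between a next-level access and a local read. The names `CacheStats`, `should_bypass`, and all energy parameters are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Cache-wide counters (hypothetical; not taken from the paper)."""
    fills: int = 0  # lines allocated into the cache so far
    hits: int = 0   # hits served by allocated lines so far

    def hits_per_fill(self) -> float:
        # Average reuse observed across the whole cache, not per line.
        return self.hits / self.fills if self.fills else 0.0


def should_bypass(stats: CacheStats, e_write: float, e_read: float,
                  e_next_level: float) -> bool:
    """Return True when allocation is not expected to repay its write cost.

    Assumed model: each hit saves (e_next_level - e_read); allocation
    costs e_write. All energies share one arbitrary unit.
    """
    if stats.fills == 0:
        # No history yet: allocate so that statistics can be gathered.
        return False
    expected_saving = stats.hits_per_fill() * (e_next_level - e_read)
    return expected_saving <= e_write


# Low reuse: 0.5 hits per fill saves 1.25 units, less than the 2.0 write cost.
low_reuse = should_bypass(CacheStats(fills=100, hits=50), 2.0, 0.5, 3.0)
# High reuse: 2 hits per fill saves 5.0 units, more than the write cost.
high_reuse = should_bypass(CacheStats(fills=100, hits=200), 2.0, 0.5, 3.0)
```

Under this model a more expensive write raises the reuse needed to justify allocation, which is the intuition behind treating asymmetric-access caches differently from SRAM ones. A practical design would also need a way to keep the statistics fresh while lines are being bypassed, which this sketch leaves out.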
