Working with Process Variation Aware Caches

Deep-submicron designs must account for process variation, because variations in critical process parameters cause large variations in the access latencies of hardware components. The effect is especially severe in memory components, which are built from minimum-sized transistors. In this work, we study the effect of access latency variation on performance, using on-chip data caches as the case study. We first discuss the performance loss of a worst-case design, in which the entire cache operates at the worst-case process variation delay, and then consider process variation aware cache designs that work at set-level granularity. We then propose a technique called block rearrangement to minimize the performance loss incurred by a process variation aware cache that works at set-level granularity. With block rearrangement, we rearrange the physical locations of cache blocks so that the n blocks of a cache set (assuming an n-way set-associative cache) can reside in multiple rows, instead of a single row as in a cache with a conventional addressing scheme. By distributing the blocks of a cache set over multiple rows, we minimize the number of sets affected by process variation. We evaluate the technique on SPEC2000 CPU benchmarks and show that it achieves significant performance benefits over caches with a conventional addressing scheme.
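
The intuition behind set-level granularity and block rearrangement can be illustrated with a small sketch. This is not the mechanism evaluated in the paper: the latency values, the function names, and the simple sort-and-regroup policy are all assumptions chosen for illustration, and the sketch ignores any hardware constraint on which blocks may exchange places. It only shows why grouping slow blocks into fewer sets reduces the number of sets that pay the extra delay.

```python
from typing import List

Rows = List[List[int]]  # rows[r][w] = access latency (cycles) of the block in row r, way w


def set_latencies(rows: Rows) -> List[int]:
    """Set-level granularity: each set runs at the latency of its slowest block."""
    return [max(row) for row in rows]


def worst_case_latency(rows: Rows) -> int:
    """Worst-case design: the entire cache runs at the slowest block's latency."""
    return max(max(row) for row in rows)


def rearrange(rows: Rows) -> Rows:
    """Hypothetical rearrangement policy (an assumption, not the paper's scheme):
    sort all physical blocks by latency and regroup them n at a time, so that
    slow blocks end up sharing as few logical sets as possible."""
    n = len(rows[0])
    blocks = sorted(lat for row in rows for lat in row)
    return [blocks[i:i + n] for i in range(0, len(blocks), n)]


def slow_sets(rows: Rows, nominal: int) -> int:
    """Number of sets whose latency exceeds the nominal latency."""
    return sum(1 for lat in set_latencies(rows) if lat > nominal)


# Made-up 4-way cache with 4 rows; one block per row is slowed by process variation.
rows = [
    [1, 1, 2, 1],
    [1, 2, 1, 1],
    [1, 1, 1, 2],
    [2, 1, 1, 1],
]

conventional_slow = slow_sets(rows, nominal=1)            # one set per row
rearranged_slow = slow_sets(rearrange(rows), nominal=1)   # sets span multiple rows
```

In this made-up example every row contains one slow block, so conventional addressing leaves all four sets slow, while the regrouped mapping confines the four slow blocks to a single set.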
