Energy effective 3D stacked hybrid NEMFET-CMOS caches

In this paper we propose to utilise 3D-stacked hybrid memories as an alternative to traditional CMOS SRAMs in L1 and L2 cache implementations, and analyse the implications of this approach for processor performance, measured in Instructions-per-Cycle (IPC), and for energy consumption. The 3D hybrid memory cell relies on: (i) a Short Circuit Current Free Nano-Electro-Mechanical Field Effect Transistor (SCCF NEMFET) based inverter for data storage; and (ii) adjacent CMOS-based logic for read/write operations and data preservation. We compare 3D Stacked Hybrid NEMFET-CMOS Caches (3DS-HNCC) of various capacities against state-of-the-art 45 nm low-power CMOS SRAM counterparts (2D-CC). All the proposed implementations reduce static energy by two orders of magnitude (due to the NEMFET's extremely low OFF current) and slightly increase dynamic energy consumption, while requiring an approximately 55% larger footprint. The read access time is equivalent, while the write access time is about 3 ns higher, as it is dominated by the mechanical movement of the NEMFET's suspended gate. To determine whether the write latency overhead inflicts any performance penalty, we use as an evaluation vehicle a state-of-the-art mobile out-of-order processor core equipped with 32-kB instruction and data L1 caches and a unified 2-MB L2 cache. We evaluate different scenarios, utilising both 3DS-HNCC and 2D-CC at different hierarchy levels, on a set of SPEC 2000 benchmarks. Our simulations indicate that, for the considered applications and despite their increased write access time, 3DS-HNCC L2 caches inflict an insignificant IPC penalty while providing, on average, 38% energy savings compared with 2D-CC. For L1 instruction caches the IPC penalty is also almost insignificant, while for L1 data caches IPC decreases of between 1% and 12% were measured.
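The energy trade-off described above (much lower static energy against slightly higher dynamic energy) can be illustrated with a minimal accounting sketch. It is not the authors' evaluation model: the class `CacheEnergyModel`, the function `relative_savings`, and every numeric value below are hypothetical placeholders in arbitrary units. Only the 100x static-energy ratio reflects the "two orders of magnitude" figure stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class CacheEnergyModel:
    """Hypothetical first-order cache energy model (arbitrary units)."""
    static_power: float   # leakage power drawn for the whole runtime
    read_energy: float    # dynamic energy per read access
    write_energy: float   # dynamic energy per write access

    def total_energy(self, runtime: float, reads: int, writes: int) -> float:
        # Static energy scales with runtime; dynamic energy with access counts.
        static = self.static_power * runtime
        dynamic = reads * self.read_energy + writes * self.write_energy
        return static + dynamic


def relative_savings(baseline: float, candidate: float) -> float:
    """Fraction of baseline energy saved by the candidate design."""
    return 1.0 - candidate / baseline


# Placeholder parameters, NOT taken from the paper.
cmos = CacheEnergyModel(static_power=1.0, read_energy=1.0, write_energy=1.0)
# Static power 100x lower (two orders of magnitude, as stated in the abstract);
# the 10% dynamic overhead is an assumed stand-in for "slightly increased".
hybrid = CacheEnergyModel(static_power=0.01, read_energy=1.1, write_energy=1.1)

# Placeholder workload: runtime and access counts are illustrative only.
runtime, reads, writes = 100.0, 50, 10
e_cmos = cmos.total_energy(runtime, reads, writes)
e_hybrid = hybrid.total_energy(runtime, reads, writes)
savings = relative_savings(e_cmos, e_hybrid)
print(f"CMOS: {e_cmos:.2f}, hybrid: {e_hybrid:.2f}, savings: {savings:.1%}")
```

Under this kind of model, the net saving depends on how much of the baseline energy is static: a rarely accessed cache benefits most from the lower leakage, while a frequently accessed one is affected more by the dynamic overhead.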
