ContextPreRF: Enhancing the Performance and Energy of GPUs With Nonuniform Register Access
[1] Jaehyuk Huh, et al. A NUCA Substrate for Flexible CMP Cache Sharing, 2007, IEEE Transactions on Parallel and Distributed Systems.
[2] G. Edward Suh, et al. SRAM-DRAM hybrid memory with applications to efficient register files in fine-grained multi-threading, 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[3] Norman P. Jouppi, et al. CACTI 6.0: A Tool to Model Large Caches, 2009.
[4] Yiran Chen, et al. C1C: A configurable, compiler-guided STT-RAM L1 cache, 2013, TACO.
[5] Cong Xu, et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory, 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[6] Kaushik Roy, et al. DWM-TAPESTRI - An energy efficient all-spin cache using domain wall shift based writes, 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[7] Henry Wong, et al. Analyzing CUDA workloads using a detailed GPU simulator, 2009, 2009 IEEE International Symposium on Performance Analysis of Systems and Software.
[8] Kevin Skadron, et al. Rodinia: A benchmark suite for heterogeneous computing, 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).
[9] Shunsuke Fukami, et al. Micromagnetic analysis of current driven domain wall motion in nanostrips with perpendicular magnetic anisotropy, 2008.