A performance & power comparison of modern high-speed DRAM architectures
Bruce Jacob | Shang Li | Dhiraj Reddy
[1] Mehrzad Samadi, et al. Memory-centric system interconnect design with hybrid memory cubes, 2013, PACT 2013.
[2] Norman P. Jouppi, et al. Rethinking DRAM design and organization for energy-constrained multi-cores, 2010, ISCA.
[3] Jinkyu Jeong, et al. A fully associative, tagless DRAM cache, 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[4] William J. Dally, et al. Memory access scheduling, 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).
[5] Somayeh Sardashti, et al. The gem5 simulator, 2011, CARN.
[6] Sally A. McKee, et al. Hitting the memory wall: implications of the obvious, 1995, CARN.
[7] Natalie D. Enright Jerger, et al. Achieving predictable performance through better memory controller placement in many-core CMPs, 2009, ISCA '09.
[8] Mark Oskin, et al. A Software-Managed Approach to Die-Stacked DRAM, 2015, 2015 International Conference on Parallel Architecture and Compilation (PACT).
[9] Lizy Kurian John, et al. Minimalist open-page: A DRAM page-mode scheduling policy for the many-core era, 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[10] Darko Živanovič. Memory systems for high-performance computing: the capacity and reliability implications, 2018.
[11] William J. Dally, et al. Architecting an Energy-Efficient DRAM System for GPUs, 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[12] Dongdong Li, et al. Inter-Core Locality Aware Memory Scheduling, 2016, IEEE Computer Architecture Letters.
[13] Lei Liu, et al. BPM/BPM+: Software-based dynamic memory partitioning mechanisms for mitigating DRAM bank-/channel-level interferences in multicore systems, 2014, TACO.
[14] Jack Dongarra, et al. Toward a New Metric for Ranking High Performance Computing Systems, 2013, Sandia Report.
[15] Onur Mutlu, et al. The application slowdown model: Quantifying and controlling the impact of inter-application interference at shared caches and main memory, 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[16] Onur Mutlu, et al. Improving DRAM performance by parallelizing refreshes with accesses, 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[17] John L. Henning. SPEC CPU2006 benchmark descriptions, 2006, CARN.
[18] Trevor N. Mudge, et al. A performance comparison of contemporary DRAM architectures, 1999, ISCA.
[19] J. Thomas Pawlowski, et al. Hybrid memory cube (HMC), 2011, 2011 IEEE Hot Chips 23 Symposium (HCS).
[20] David Roberts, et al. Heterogeneous memory architectures: A HW/SW approach for mixing die-stacked and off-package memories, 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[21] Jose Renau, et al. Effective Optimistic-Checker Tandem Core Design through Architectural Pruning, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[22] Paul Rosenfeld, et al. Performance Exploration of the Hybrid Memory Cube, 2014.
[23] Eduard Ayguadé, et al. Another Trip to the Wall: How Much Will Stacked DRAM Benefit HPC?, 2015, MEMSYS.
[24] Tao Zhang, et al. CREAM: A Concurrent-Refresh-Aware DRAM Memory architecture, 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[25] Jaeha Kim, et al. Memory-centric system interconnect design with Hybrid Memory Cubes, 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[26] Bruce Jacob, et al. Buffer-on-board memory systems, 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[27] Sadagopan Srinivasan. Prefetching Vs The Memory System: Optimizations for Multi-core Server Platforms, 2007.
[28] Alaa R. Alameldeen, et al. Transparent Hardware Management of Stacked DRAM as Part of Memory, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[29] Vijayalakshmi Srinivasan, et al. Scalable high performance main memory system using phase-change memory technology, 2009, ISCA '09.
[30] Bruce Jacob, et al. Concurrency, latency, or system overhead: which has the largest impact on uniprocessor DRAM-system performance?, 2001, ISCA 2001.
[31] Jaehyuk Huh, et al. Reducing the Memory Bandwidth Overheads of Hardware Security Support for Multi-Core Processors, 2016, IEEE Transactions on Computers.
[32] Aamer Jaleel, et al. DRAMsim: a memory system simulator, 2005, CARN.
[33] Aamer Jaleel, et al. Fully-Buffered DIMM Memory Architectures: Understanding Mechanisms, Overheads and Scaling, 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[34] Babak Falsafi, et al. BuMP: Bulk Memory Access Prediction and Streaming, 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[35] Bruce Jacob, et al. DRAMSim2: A Cycle Accurate Memory System Simulator, 2011, IEEE Computer Architecture Letters.
[36] Mahmut T. Kandemir, et al. Evaluating STT-RAM as an energy-efficient main memory alternative, 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[37] Bruce Jacob, et al. Fine-Grained Activation for Power Reduction in DRAM, 2010, IEEE Micro.
[38] Natalie D. Enright Jerger, et al. Evaluating the memory system behavior of smartphone workloads, 2014, 2014 International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS XIV).
[39] Daisuke Takahashi, et al. The HPC Challenge (HPCC) benchmark suite, 2006, SC.
[40] Rajeev Balasubramonian, et al. Managing DRAM Latency Divergence in Irregular GPGPU Applications, 2014, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis.
[41] Radu Sion, et al. DIMMer: A case for turning off DIMMs in clouds, 2014, SoCC.
[42] Carole-Jean Wu, et al. Characterization and Throttling-Based Mitigation of Memory Interference for Heterogeneous Smartphones, 2015, 2015 IEEE International Symposium on Workload Characterization.
[43] Hsien-Hsin S. Lee, et al. Smart Refresh: An Enhanced Memory Controller Design for Reducing Energy in Conventional and 3D Die-Stacked DRAMs, 2007, 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007).
[44] D. Burger, et al. Memory Bandwidth Limitations of Future Microprocessors, 1996, 23rd Annual International Symposium on Computer Architecture (ISCA'96).
[45] Onur Mutlu, et al. Architecting phase change memory as a scalable dram alternative, 2009, ISCA '09.
[46] Mary Lou Soffa, et al. DraMon: Predicting memory bandwidth usage of multi-threaded programs with high accuracy and low overhead, 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[47] Tor M. Aamodt, et al. Complexity effective memory access scheduling for many-core accelerator architectures, 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[48] Steven Przybylski, et al. New DRAM Technologies: A Comprehensive Analysis of the New Architecture, 1994.