A Comprehensive Analytical Performance Model of DRAM Caches
[1] Norman P. Jouppi,et al. CACTI: an enhanced cache access and cycle time model , 1996, IEEE J. Solid State Circuits.
[2] Vijayalakshmi Srinivasan,et al. On the Nature of Cache Miss Behavior: Is It √2? , 2008, J. Instr. Level Parallelism.
[3] William J. Dally,et al. Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture.
[4] Hsien-Hsin S. Lee,et al. An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[5] Cong Xu,et al. Moguls: A model to explore the memory hierarchy for bandwidth improvements , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).
[6] Babak Falsafi,et al. Die-stacked DRAM caches for servers: hit ratio, latency, or bandwidth? have it all with footprint cache , 2013, ISCA.
[7] Rami G. Melhem,et al. Writeback-aware bandwidth partitioning for multi-core systems with PCM , 2013, Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques.
[8] Somayeh Sardashti,et al. The gem5 simulator , 2011, CARN.
[9] R. Govindarajan,et al. ANATOMY: an analytical model of memory system performance , 2014, SIGMETRICS '14.
[10] Jung Ho Ahn,et al. The Design Space of Data-Parallel Memory Systems , 2006, ACM/IEEE SC 2006 Conference (SC'06).
[11] Tor M. Aamodt,et al. Modeling Cache Contention and Throughput of Multiprogrammed Manycore Processors , 2012, IEEE Transactions on Computers.
[12] Cheng-Chieh Huang,et al. ATCache: Reducing DRAM cache latency via a small SRAM tag cache , 2014, 2014 23rd International Conference on Parallel Architecture and Compilation (PACT).
[13] Mahmut T. Kandemir,et al. Evaluating STT-RAM as an energy-efficient main memory alternative , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[14] Tor M. Aamodt,et al. Hybrid analytical modeling of pending cache hits, data prefetching, and MSHRs , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[15] William J. Dally,et al. Memory access scheduling , 2000 .
[16] Mark D. Hill,et al. Efficiently enabling conventional block sizes for very large die-stacked DRAM caches , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[17] Yan Solihin,et al. CHOP: Adaptive filter-based DRAM caching for CMP server platforms , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[18] Gabriel H. Loh,et al. Fundamental Latency Trade-off in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[19] Hyojin Choi,et al. Memory access pattern-aware DRAM performance model for multi-core systems , 2011, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[20] Tor M. Aamodt,et al. A Hybrid Analytical DRAM Performance Model , 2011 .
[21] Mark D. Hill,et al. A case for direct-mapped caches , 1988, Computer.
[22] Mark Horowitz,et al. An analytical cache model , 1989, TOCS.
[23] Fang Liu,et al. Understanding how off-chip memory bandwidth partitioning in Chip Multiprocessors affects system performance , 2010, HPCA - 16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture.
[24] G. Edward Suh,et al. Analytical cache models with applications to cache partitioning , 2001, ICS '01.
[25] John L. Henning. SPEC CPU2006 benchmark descriptions , 2006, CARN.
[26] Charles D. Pack,et al. The Output of an M/D/1 Queue , 1975, Oper. Res..
[27] R. Plackett,et al. Karl Pearson and the Chi-squared Test , 1983 .